# Sergey Budaev

## Using Subversion to manage Office files

Because Subversion works best (and can track) plain text files, it is not well adapted for versioning normal Microsoft Office or LibreOffice/OpenOffice documents. However, both are actually zipped XML files. Therefore, it is possible both directly (binary) and using flat XML text (full version control/merge support).

## Microsoft Office

For Microsoft Office, there are extensions for Subversion: Msofficesvnf, OfficeSVN and MagnetSVN.

Also, TortoiseSVN can use native Microsoft Word "compare versions" tool to check for differences between versions. Check out the Diff-Scripts in the TortoiseSVN installation directory. Note that these scripts are js and can be blocked by corporate or university security policy: ask the IT!

### Subversion keywords

Subversion keywords (properties) can be managed in Microsoft Word files using SvnProperties4MSOffice.

If special software for adapting Office files is not used, it is recommended to use Microsoft uncompressed XML formats for all outputs. While they take more disk space (because it is unzipped), these are plain text XML, so Subversion treats them very efficiently. Also, svn keywords/tags can be used within the text without any additional tools.

## LibreOffice or OpenOffice

For LibreOffice, the easiest way is to use .fodt format for saving the document (instead of .odt or .docx), FODT is a flat XML format. A drawback is that it is unzipped and takes much more disk space. But Subversion does not store all versions of the whole file, it saves effectively differences between the versions. Therefore, there is little or no overhead within the version control system of working with fodt files.

Quite importantly, it is then trivial to add keywords to the fodt file on the svn system. Then, it is easy to include normal svn keywords/tags] such as $Revision 1234$ into whenever needed into the fodt file and it will autoupdate on every commit without any additional tools. But note that the whole tag $Revision 1234$ must have the same formatting (i.e. no bold/italic/other font within and including the  delimiters).

## Conflicts

To avoid conflicts when several people are working concurrently with svn-tracked files, use svn locks. This is because the files are like binary and cannot be easily merged, unlike normal plain text code. In fact, they could be merged, but do not always expect merge to work as expected because the text file includes complex tags and these may be broken at merge.

It is also difficult to resolve conflicts visually. A useful trick is to set this property on the file: svn propset svn:needs-lock "true" file_name.fodt. Then, any svn update will result this file becoming read-only. To allow editing, file lock must be enabled. This ensures that only one user can edit the file at a time.

## How differences between versions can be checked?

Because the Libreoffice files are not just text, checking differences is not trivial. Normal diff tool will result in lots of messy XML differences.

But there is a Linux bash script that helps comparing the files through converting FODT to PDF and then running diffpdf utility:

There is also a Windows/DOS batch script that does this trick:

The script requires diffpdf program that is found in most Linux distributions. A Windows version is open source but id not normally distributed in the binary ".exe" form

### How to use diffodt script

• Compare working copy with the latest revision from svn: diffodt paper.fodt

• Compare the working document with r9925: diffodt 9925 paper.fodt

• Compare two specific versions of the document: diffodt 9925 9987 paper.fodt

## Integrating Subversion into LibreOffice User Interface

Lo_SVN is a LibreOffice extension that adds a basic Subversion functionality into the LibreOffice interface. Then, basic svn commands are available from the LibreOffice menu.

## How to make an array and initialize it with a sequence of values in Fortran?

How do you make an array and initialize it with a sequence of values? For example, I want a list from 0.25 to 1.5 that is separated with 0.25. In other words I want something similar to seq(0.25,5,0.5) in R.

## Equally spaced real array with fixed increment in Fortran

Producing an equally spaced array from V1 to VN with increments ΔV

$\{V_1,V_2=V_1+\Delta V,V_3=V_2+\Delta V,V_4=V_3+\Delta V, ... V_N=V_{N-1}+\Delta V\}$

1. Each of the values in the above vector can be calculated as:

$V_i=V_1+\Delta V(i-1)$

2. The total number of values N in the array ending with a fixed known VN is equal to

$N=\frac{V_N-V_1}{\Delta V}+1$

3. It is not possible to use a simple piece of code like this to produce real type array in Fortran:

Array = [V1:VN:Incr]


4. Such a construction cannot be used in modern Fortran, even though old versions could accept a similar construction based on implied loop with real type index counter:

real :: r ! Index must be integer in loops!
print *, (r, r=V1,VN,Incr)


5. In modern Fortran standard do loops can only have integer indexing variable. Real indexing in do loops is one of the very few features that had been deleted from the language because it can create lots of problems in float point computations due to finite precision in computer hardware.

The old code might work with modern compilers but it may require special legacy compiler options. The printing-only code as above may still work but would issue a compiler warning.

6. Initialising such equally spaced real type arrays in Fortran implied loops must use the formulas defined in 1. and 2.

# Produce exactly N_VALS values starting from INIT with increments INCR
Array = [( INIT + INCR * (i-1), i=1,N_VALS )]


Where the number of array elements N_VALS is calculated as:

N_VALS = floor( (END - INIT) / INCR + 1 )

N_VALS = ceiling( (END - INIT) / INCR + 1 )


The floor and ceiling functions convert real value to integer as the lower or upper nearest integer; they can give different values when division cannot be done without the remainder

# All values starting from INIT with increments INCR and up to the limit END
Array = [( INIT + INCR * (i-1), i=1,floor((END-INIT)/INCR+1) )]


7. This code does not seem to be a very simple and elegant solution. Ideally, the code should be packaged into a function returning the desired grid array. But such function could not be used in declarations of array parameters. In the later case the one-liner code should be used as above.

## Integer arrays

By the way, it is quite easy to produce an integer array, e.g. here is an initialisation for array from 1 to 100 (|1,2,3,...,100|). This can be useful for indexing arrays.

integer, parameter, dimension(*) :: IDX_ARRAY = (/(i,i=1,100)/)


## Examples:

A. Produce an array of 10 values starting from 1.0 with increments 0.1

Array = [( 1.0 + (i-1) * 0.1, i=1,10 )]


Result:

1.00000000 1.10000002 1.20000005 1.29999995 1.39999998
1.50000000 1.60000002 1.70000005 1.79999995 1.90000010


Declaration of a parameter array:

real, parameter, dimension(*) :: Array = [( 1.0 + (i-1) * 0.1, i=1,10 )]


However, note that not all compilers may support assumed array size dimension(*) in such array declaration statement, this requires newer Fortran standard (fortunately, recent versions of Intel and GNU Fortran do support assumed size arrays). In such a case declaration must explicitly set the number of array elements:

real, parameter, dimension(10) :: Array = [( 1.0 + (i-1) * 0.1, i=1,10 )]


B. Produce an array of starting from 1.0 to 2.0 with increments 0.145; note that lower value (floor) for the array size is used:

Array = [( 1.0 + 0.145 * (i-1), i=1, floor((2.0-1.0)/0.145 + 1) )]


Result:

1.00000000 1.14499998 1.28999996 1.43499994 1.57999992
1.72499990 1.87000000


C. The same as (B) but the upper value (ceiling) for the array size is used:

Array = [( 1.0 + 0.145 * (i-1), i=1, ceiling((2.0-1.0)/0.145 + 1) )]


Result:

1.00000000 1.14499998 1.28999996 1.43499994 1.57999992
1.72499990 1.87000000 2.01499987


D. In the case B., declarations of parameter arrays can be done like this:

real, parameter, dimension(*) :: Array =                                  &
[( 1.0 + 0.145 * (i-1), i=1, floor((2.0-1.0)/0.145 + 1) )]


or, if the compiler does not support assumed size arrays (*), with explicitly calculated array size:

real, parameter, dimension(floor((2.0-1.0)/0.145 + 1)) :: Array =         &
[( 1.0 + 0.145 * (i-1), i=1, floor((2.0-1.0)/0.145 + 1) )]


## Test program

! This program illustrates how to produce equally spaced real vectors with
! fixed increment in Fortran.
!
! 1. Produce exactly N_VALS values starting from INIT with increments INCR
! Array = [( INIT + INCR * (i-1), i=1,N_VALS )]
!
! 2. All values starting from INIT with increments INCR and up to the limit END
! Array = [( INIT + INCR * (i-1), i=1,floor((END-INIT)/INCR+1) )]
!-------------------------------------------------------------------------------
program spaced_array

! Integer counter for implied loops defining vectors.
integer :: i

! Example A. Produce an array of 10 values
! starting from 1.0 with increments 0.1
real, parameter, dimension(*) :: Array1 = [( 1.0 + (i-1) * 0.1, i=1,10 )]

! Example B. Produce an array of starting from 1.0 to 2.0
! with increments 0.145.
! Note that lower value (floor) for the array size is used.
real, parameter, dimension(*) :: Array2 = &
[( 1.0 + 0.145 * (i-1), i=1, floor((2.0-1.0)/0.145 + 1) )]

! Example C. The same as (B) but the upper value (ceiling) for the
! array size is used.
real, parameter, dimension(*) :: Array3 = &
[( 1.0 + 0.145 * (i-1), i=1, ceiling((2.0-1.0)/0.145 + 1) )]

! Print the sizes of the arrays that were declared above.
print *, "Array sizes (Array1, Array2, Array3)", &
size(Array1), size(Array2), size(Array3)

! Print the parameter arrays that were declared above.
print *, "Array1", Array1
print *, "Array2", Array2
print *, "Array3", Array3

end program spaced_array


## How to produce a reverse of a vector in Fortran?

Let's we have a vector A, e.g.

A = [1,2,3,4,5]


How to produce a vector with reverse indices, e.g.

B = [5,4,3,2,1]


B = A( size(A):1:-1 )


To reverse A itself do

A = A( size(A):1:-1 )


## Gaussian random numbers in Fortran

The HEDTOOLS tools library has a module for working with random numbers BASE_RANDOM​. There is, in particular, a set of procedures for generating Gaussian random values: ​RNORM and RNORM_ARRAY. These are based on the Kinderman & Monahan, augmented with quadratic bounding curves method (Leva, 1992: algorithm 712, Trans. Math. Software, 18, 4, 434-435​).

I have made a quick comparison of the quality of the Gaussian random numbers generated by the simple Box-Muller method (Box & Muller, 1958​)

Classical (ancient) Fortran code:

normrand_number = dsqrt(-2.*dlog(drand(0)))*dcos(2.*pi*drand(0))


that has been used in TEG codes so far...

and the algorithm 712 as implemented in HEDTOOLS using this test program (see attachment).

Fortran code for the test program:

program test_bm
use csv_io
use base_random, rand_x => rand     ! Alias rand() as rand_x() for ifort.
!use IFPORT, only : rand_x => rand  ! This is the Intel Fortran tweak.

integer, parameter :: prec = 8, arrsize=100000
character(len=255), parameter :: filename1="file_01.csv", filename2="file_02.csv"
real(kind=prec), dimension(arrsize) :: norand1, norand2
real :: timer_start, timer_end
!-------------------------------------------------------------------------------
! Generating Box-Muller random numbers
call cpu_time(timer_start)  ! START
do i=1, arrsize
norand1(i) = sqrt(-2.*log(rand_x(0)))*cos(2.*pi*rand_x(0))
end do
call cpu_time(timer_end)    ! END
print *, "Box-Muller took: ", timer_end - timer_start
! Write random normal data to CSV
call CSV_MATRIX_WRITE(norand1, filename1)
!-------------------------------------------------------------------------------

!-------------------------------------------------------------------------------
! Generating based on algorithm 712
call cpu_time(timer_start)  ! START
call RNORM_ARRAY(norand2)
call cpu_time(timer_end)    ! END
print *, "Alg. 712 took: ", timer_end - timer_start
! Write random normal data to CSV
call CSV_MATRIX_WRITE(norand2, filename2)
!-------------------------------------------------------------------------------

end program test_bm


## Comparison of Box-Muller and A712

The alg. 712 looks slightly faster than the simple Box-Muller transform.

alg. 712 is much better, as the Box-Muller significantly deviates from the normal distribution, alg. 712 does not (using the Anderson-Darling test from the nortest R package).

  # Gaussian random numbers by Box-Muller deviate from the Normal distribution:
> ad.test(data_bm$X1) Anderson-Darling normality test data: data_bm$X1
A = 581.7, p-value < 2.2e-16
# Gaussian random numbers by Kinderman & Monahan's A712 do not deviate from the Normal distribution:
> ad.test(data_a712$X1) Anderson-Darling normality test data: data_a712$X1
A = 0.46975, p-value = 0.2474


So, the alg. 712 procedure implemented in HEDTOOLS should be used instead of the Box-Muller method.

## References

• Box, G. E. P., & Muller, M. E. (1958). A note on the generation of random normal deviates. The Annals of Mathematical Statistics, 29(2), 610–611. ​http://doi.org/10.1214/aoms/1177706645

• Leva, J. L. (1992). Algorithm 712; a normal random number generator. ACM Transactions on Mathematical Software, 18(4), 454–455. ​http://doi.org/10.1145/138351.138367