Tuesday, 7 May 2024

Fortran binary output -> Python

Since I don't really do anything with my Raspberry Pi, this blog "My Pi's Mad At Me" may as well become a more general computing blog.  And since I have spent a bit of time lately working something out, I thought I'd make a post about it.

I have a moderately large physics code (Sky3d, published here (v1.0) and here (v1.1)) which produces output that can be usefully post-processed to derive further physically interesting results.  These output files are, at the moment, written in the standard unformatted Fortran binary format, which means they can be easily read in by a Fortran code and further processed.  Unlike using, say, a text-based formatted output, there is no loss in precision from the reading and writing process when using unformatted output, though it is not human-readable, and it is the natural thing to do.

Since I now make use of Python for smallish coding jobs as much as I do of Fortran, I wanted to make sure I could read these binary files into Python, and started to look in to how to do it.  Fortran is a relatively small language in which I can, or at least could when I was using it daily, keep most of it in my mind, or know quickly where to look up.  Python is a sprawling archipelago of ideas, and ways of doing things, and sort-of-standard libraries, which is great in letting you do lots of high-level stuff, but hard for anyone not fully immersed in it to know how best to do something, especially when relying on third-party libraries.  So, my first search gave me a method within the very standard SciPy library which could read Fortran binary files, but the documentation cautions that "Since this is a non-standard file format, whose contents depend on the compiler and the endianness of the machine, caution is advised". 

So I rummanged around a bit more and decided to go to the Fortran end of things and re-write the output in a more portable way using stream i/o, which came in to Fortran in the 2003 standard, and which "allows easy interoperability with C binary stream" according to my copy of "Modern Fortran Explained".

I therefore wrote a Fortran code which fairly simply opened an existing records-based Fortran unformatted binary file for reading, and wrote out each variable one-by-one to a new file opened as a stream-based unformatted binary file. 

To see if it worked, I had to read it in to a Python program.  Here, there seem many possible options, and I went for the 'struct' library, partly as it seems to be inside the Python language and so doesn't necessitate downloading and loading and keeping track of versions of other libraries (e.g. numpy.fromfile probably does the same job, but why bring in a large library like numpy if you don't have to?)

So, for example, a binary write of an integer in Fortran like 

write(11) iter

is read in Python as

iter = struct.unpack('i',f.read(4))

where the 'i' means integer and the 4 means 4 bytes.  f is the file handle that has been opened in Python, while the 11  in the Fortran line is the unit number of the file when written out in the Fortran code.

It works nicely, but I was scratching my head for a while when dealing with what Fortran calls logical variables and in Python are booleans.  My Fortran code had been using, and writing out to file Fortran's default logical type, which turns out on my implementation of Fortran to be 4 bytes long, encoded in a slightly strange way.  Consider the following Fortran program                                               
 

program logical_test
  integer :: u
  logical :: l1=.true., l2=.true.
  open(newunit=u,file='data.stream',form='unformatted',access='stream')

  write(u) l1, l2
end program logical_test
 

which writes out two default logical variables of value .true..  When run, the resulting file data.stream is 8 bytes long and looks like this:

[~] od -x data.stream                                                 
0000000      0001    0000    0001    0000                              
0000010

so there are two sets of 4 bytes, which both look like 00 01 00 00 (in hex), and the interesting information is encoded in the second byte of 4.  Well, a weird choice, and one that can only be read in the Python method above if you know exactly how this Fortran data type is encoded, and know to read in the padding bytes and throw them away.  That's not what I want at all, and before understanding this padding I was confused why other variables printed out after the logical variable appeared to be read in as garbage.

My initial solution was to not use logical/boolean data types at all, but to encode the information as integers which I knew how to encode and transfer in the binary files.  Then I looked up the types (as Fortran calls them) in the intrinsic iso_c_binding module, and there is one called c_bool.  If I use that, then it gives a 1-byte representation and I can easily read that using the Python struct library that I showed above:

Now my test Fortran program reads:

program logical_test
  use iso_c_binding
 
  integer :: u
  logical(c_bool) :: l1=.true., l2=.true.
  open(newunit=u,file='data.stream',form='unformatted',access='stream')

  write(u) l1, l2
end program logical_test

and my data.stream file is 2 bytes long:

[~] od -x data.stream
0000000     0101                                                       

0000002

and I can read it in Python as

import struct

f=open('data.stream','rb')
print(struct.unpack('?',f.read(1))[0])
print(struct.unpack('?',f.read(1))[0])

and the result is

[~] python logical.py
True
True

and now numerical data coming after that can be read safely and I don't need to worry about where in an 4-byte string a logical value is being encoded.





Wednesday, 27 January 2021

Setting up a Samba server

My Raspberry Pi's main purpose is to host a raid array.  It does that more or less fine, though it took me a while to find a stable enough powered USB hub to host the pair of external hard drives.

To server the files to the rest of my network I connect to the Pi via sshfs on the client computer.  This has the benefit of not needing me to do anything else on the Pi, though it means installing sshfs on the client systems.  A downside is that sshfs is not supported natively on the Macs that I am connecting to the Pi with, and I need to make sure I keep sshfs working whenever e.g. a new update happens to the Mac OS.  sshfs is also an encrypted protocol, yet I don't necessarily want the overhead of encryptic massive media files which is what I usually access from the raid array.  Authentication would be nice, but encryption of the data is not really necessary in my case.  

So, I'd like to set up a samba server, which is the Windows files sharing protocol that will allow me to mount the raid data on the Pi as a shared drive on an other machine (be it Windows, Mac, Linux, or my Android tablet).  

I'm going to follow the official raspberry pi site to install and run samba.  I will document what I do here so that I can refer back to it.

The first step is to install samba via:

sudo apt update
sudo apt install samba samba-common-bin smbclient cifs-utils

doing this and saying 'yes' to the confirmation to go ahead proceeds as expected and the software installs.

Next, I jump to the "Sharing a folder for use by Windows" section of the above web page.  I already have a folder to share, which in my case is /mnt/raid.  So I edit the file /etc/samba/smb.conf  and add the following to the end:

[share]
   path = /mnt/raid
   read only = no
   public = yes
   writeable = yes 

I follow the suggestions of the web page, but that's all fine as I want the drive to be freely available to users on my home network.

There is also a suggestion to change the workgroup setting to agree with my home Windows networt.  Since this is the first thing to use my home windows network, I generate a new name for the workgroup:

   workgroup = 12leas    

Saving the file appears to be enoug for samba to pick it up, and indeed I can see a directory called "share" on my Mac:


Note that "ah-satan" is the name of my Raspberry Pi.  (It's my wife's name, backwards)

success!


Thursday, 9 March 2017

My Raspberry Pi blog

Welcome to my blog about the raspberry pi computer.  I've been using the first model pi for a while (as a raid array) and I just got a later model which I haven't taken out of the box yet.  

Having spent lots of time playing with / configuring the older pi, I thought that it would be helpful for me to start documenting what I've been doing, more so that I can refer back to what I have done, that anything else.  Having it as a blog, though, means that what I learn might just end up being useful for someone else.

The title of the blog is "My Pi's Mad at Me".  I don't really think my Pi is mad at me, but I liked the pun based on this Tracey Ullman cover of a Madness song: