Previous | Contents | Next


5. Developer Notes

The best way to learn about the structure of libmtcelledit is to study the public API header file, and the programs that use it.

5.1 Example Programs

I find libmtcelledit a very useful library when developing my own software so I thought it would be helpful to include a few of these programs with the mtCellEdit package to demonstrate what can be done.

5.1.1 mtNetLog

This program demonstrates how you can create a small console program to monitor network activity, and then use spreadsheet formulas for accounting. Data is saved as values in a TSV file so its contents can be read at the command line at any time (even while logging). You can run this program at bootup and it will record all network activity. I do this by inserting the following code at the end of file like /etc/rc.d/rc.S or /etc/rc.local :

  /usr/bin/mtnetlog -network ppp0 -delay 10 /var/log/mtnetlog_ppp0.tsv &
  /usr/bin/mtnetlog -network eth0 -delay 10 /var/log/mtnetlog_eth0.tsv &

5.1.2 mtEleana

This program demonstrates how TSV files can be used to index other input files. In this case, mtEleana is a front end to the UK General Election results from 1945 to 2015.

5.1.3 mtRaft

This program demonstrates how you can use a spreadsheet to make the following jobs easier:

5.2 Porting and Hacking mtCellEdit

mtCellEdit was designed and built on a GNU/Linux system, with the GUI created with the help of Qt4/5. It should possible to port it anywhere else (with a little effort) as it is just a C/C++ program. I don't have enough time and energy to do this porting myself but feel free to port the code if you want (according to the GNU GPL of course) and tell me or the world about your experiences.

5.2.1 User Defined Menu Keyboard Shortcuts

If you don't like the default menu keyboard shortcuts, it is possible to change them by tweaking the source code. This is a fairly easy thing to do:

5.3 Command Line Examples

Because the TSV file format is so simple its very easy to read using the command line or a script. This is completely independent of the mtCellEdit GUI or even the core libmtcelledit library. This ensures that your data is completely accessible on any properly equipped GNU operating system.

Here is a summary of some useful commands for manipulating a TSV spreadsheet:

Command Description
awk General purpose parsing, manipulation, and reporting
cat Concatenate various files together
cut Remove columns
head Output the top rows of a file
paste Paste one files data after another files data (line by line)
sed Simple regexp's
sort Sort rows
tac Reverse the order of the rows (vertical flip)
tail Output the bottom of a file
tee Read from stdin and output to stdout and files
tr Translate or delete characters
uniq Remove duplicate rows (see also sort -u)

The shell also provides the usual redirections:

> Create a new file
>> Append rows to a file

5.3.1 awk

Writing a small awk program to process a TSV sheet offers a few advantages over a C++ program. For example some simple tasks require just a few awk instructions to complete, compared to a C++ program that would need to be much larger and therefore more time consuming to create and maintain.

5.3.1.1 mtNetLog Queries

You can use the awk program to process the TSV file created by mtNetLog and present the data in a more human readable form like this:

  cat /var/log/mtnetlog_ppp0.tsv | awk -F "\t" 'NR==2{for (i=1; i<=68; i++) printf "-"; printf "\n"} NR!=2{printf "%20s %10.2fMB %10.2fMB %10.2fMB %8.2f\n", $1, $7, $8, $9, 100*$8/$7}'

Which displays the in/out/total columns and a perctange of input to output flow. This will convert the data from this file:

  		61.555277776904	253420306	30916922	330063404	301.02403259277	13.748958587646	314.77299118042
  'Start	'End	'Hours	'Bytes In	'Bytes Out	'Bytes Total	'MB In	'MB Out	'MB Total
  2010-11-10 12:10:42	2010-11-11 00:39:11	12.474722223356	36193057	11935767	86299634	73.300669670105	9.0010747909546	82.30174446106
  2010-11-14 09:59:13	2010-11-15 01:36:16	15.617500000633	15034107	2117117	70948296	67.253649711609	0.40791797637939	67.661567687988
  2010-11-19 11:58:12	2010-11-20 02:43:36	14.756666664965	68171569	6224867	36326937	32.803199768066	1.8408670425415	34.644066810608
  2010-11-20 10:09:43	2010-11-21 02:21:46	16.200833332725	84574392	1741061	56100792	52.855527877808	0.6463565826416	53.501884460449
  2010-11-21 12:29:45	2010-11-21 15:00:05	2.5055555552244	49447181	8898110	80387745	74.810985565186	1.8527421951294	76.663727760315

Into this:

                           301.02MB      13.75MB     314.77MB     4.57
  --------------------------------------------------------------------
   2010-11-10 12:10:42      73.30MB       9.00MB      82.30MB    12.28
   2010-11-14 09:59:13      67.25MB       0.41MB      67.66MB     0.61
   2010-11-19 11:58:12      32.80MB       1.84MB      34.64MB     5.61
   2010-11-20 10:09:43      52.86MB       0.65MB      53.50MB     1.22
   2010-11-21 12:29:45      74.81MB       1.85MB      76.66MB     2.48

You can also use the sort command line program in order to sort according to a given field:

  cat /var/log/mtnetlog_ppp0.tsv | tail -n+3 | sort -t $'\t' -nk 3 | awk -F "\t" '{ printf "%s\t%5.2f hours\n", $1, $3 }'

Creates this output:

  2010-11-21 12:29:45	 2.51 hours
  2010-11-10 12:10:42	12.47 hours
  2010-11-19 11:58:12	14.76 hours
  2010-11-14 09:59:13	15.62 hours
  2010-11-20 10:09:43	16.20 hours

5.3.1.2 mtEleana Queries

Awk is able to process the mtEleana data very easily due to the simplicity of the files. For example here is an awk program to calculate the total number of votes for each political party:

DOWNLOAD FILE

  # party_totals.awk
  # by Mark Tyler 6th December 2010
  # Tweaked 22nd August 2011 to be ~25% faster (let sed remove ' at end)
  
  # examples:
  
  #	cat 2010.tsv | awk -f party_totals.awk | sort | sed -e "s/^'//"
  # This counts up the totals for each political party in the 2010 general election, sorting by name of party.
  
  #	cat 2010.tsv | awk -f party_totals.awk | sort -t $'\t' -rnk 2 | head | sed -e "s/^'//"
  # This outputs a top 10 of the most popular political parties from the 2010 general election.
  
  
  BEGIN {
  	FS = "\t"
  }
  
  NR>3 {
  	if ( $5 > 0 ) list[ $4 ] += $5
  }
  
  END {
  	for ( party in list )
  	{
  		printf "%s\t%s\n", party, list[ party ]
  	}
  }

The output from the second example is as follows:

  Conservative	10703654
  Labour	8606517
  Liberal Democrat	6836248
  UK Independence Party	919471
  British National Party	564321
  Scottish National Party	491386
  Green	285612
  Independent	192963
  Sinn Fein	171942
  Democratic Unionist Party	168216

5.3.1.3 Splitting fixed width text files

If a text file contains a rectangular lump of text with various columns set up using spaces it is possible to chop this up into a TSV file ready for spreadsheet editing:

  cat fixed.txt | awk -v FIELDWIDTHS='20 11 2 11 2 11 2 9' -v OFS='\t' '{ $1=$1; print }' > fixed.tsv

This cuts up the output from the mtnetlog example in 5.3.1.1. You may also want to remove spaces using the following sections.

5.3.1.4 Removing all spaces

  cat fixed.tsv | tr -d " "

This is a very crude and indiscriminate weapon of mass destruction, so care is needed when using it. For example it destroys the space between the date and time in the example from cutting up the text from 5.3.1.1. To avoid this problem use the following example.

5.3.1.5 Removing all leading and trailing spaces

You can use sed or awk to chop off leading and trailing spaces in each cell. This is useful if you want to preserve the spaces within the cell, but remove useless spaces after splitting a fixed width file:

  cat fixed.tsv | sed -e "s/^[ ]\+//" | sed -e "s/[ ]\+$//" | sed -e "s/\t[ ]\+/\t/g" | sed -e "s/[ ]\+\t/\t/g"
  
  cat fixed.tsv | awk '{ sub( /^ +/, "" ); sub( / +$/, "" );  gsub( /\t +/, "\t" ); gsub( / +\t/, "\t" ); print }'

5.3.1.6 Creating fixed width text files

To reverse the operation in 5.3.1.3 and to create a fixed width file from a TSV file:

  cat fixed.tsv | awk -v FS='\t' '{ printf "%20s%11s%2s%11s%2s%11s%2s%9s\n", $1, $2, $3, $4, $5, $6, $7, $8 }'

In this example, we want all of the chunks of text to be right justified, but if we wanted to have them left aligned we would simply place a '-' before the number like this:

  cat fixed.tsv | awk -v FS='\t' '{ printf "%-20s%-11s%-2s%-11s%-2s%-11s%-2s%-9s\n", $1, $2, $3, $4, $5, $6, $7, $8 }'

5.3.2 unzip

The command line program unzip allows you to access the sheets held inside a CED/ZIP file. Here is an example using the example data in the file 'test_suite.tsv.zip' from the mtCellEdit source code distribution:

  unzip -p test_suite.tsv.zip "test_suite.tsv/sheet/Data (1959 GE)" | awk -F "\t" 'NR>340 {  printf "%s - %s\n", $1, $2 } NR>350 {exit}'

This outputs the first 2 columns from rows 341 to 350, which results in this being displayed:

  'King's Lynn - 'DG Bullard
  'Kingston upon Thames - 'JA Boyd-Carpenter
  'Kinross and West Perthshire - 'WG Leburn
  'Kirkcaldy - 'HPH Gourlay
  'Knutsford - 'WH Bromley-Davenport
  'Lanark - 'JCM Hart
  'Lanarkshire North - 'MM Herbison
  'Lancaster - 'HJ Berkeley
  'Leeds East - 'DW Healey
  'Leeds North East - 'K Joseph
  'Leeds North West - 'D Kaberry

As you can see it prints out the ' character at the beginning of each field which is undesirable, so we ask awk to cut this out for us using the sub function:

  unzip -p test_suite.tsv.zip "test_suite.tsv/sheet/Data (1959 GE)" | awk -F "\t" 'NR>340 { sub(/^\x27/, "", $0); gsub(/\x09\x27/, "\t", $0); printf "%s - %s\n", $1, $2 } NR>350 {exit}'

Which creates this:

  King's Lynn - DG Bullard
  Kingston upon Thames - JA Boyd-Carpenter
  Kinross and West Perthshire - WG Leburn
  Kirkcaldy - HPH Gourlay
  Knutsford - WH Bromley-Davenport
  Lanark - JCM Hart
  Lanarkshire North - MM Herbison
  Lancaster - HJ Berkeley
  Leeds East - DW Healey
  Leeds North East - K Joseph
  Leeds North West - D Kaberry

If you wanted to see the last 10 rows in a file you would simply call the tail command first:

  unzip -p test_suite.tsv.zip "test_suite.tsv/sheet/Data (1959 GE)" | tail | awk -F "\t" '{ sub(/^\x27/, "", $1); sub(/^\x27/, "", $2); printf "%s - %s\n", $1, $2 }'

Which creates this:

  Worcester - GR Ward
  Worcestershire South - PG Agnew
  Workington - TF Peart
  Worthing - OL Prior-Palmer
  Wrekin, The - W Yates
  Wrexham - JI Jones
  Wycombe - J Hall
  Yarmouth - A Fell
  Yeovil - JWW Peyton
  York - CB Longbottom

5.3.3 zcat

If you have saved some TSV data inside a compressed GZ file you cat use the zcat command to access this data. This example extracts the first and third columns from a file and stores the results in a new compressed output file:

  zcat input.tsv.gz | awk -F "\t" '{ printf "%s\t%s\n", $1, $3 }' | gzip > output.tsv.gz

5.3.4 tr

5.3.4.1 Changing Field Separators

Occasionally you may come across a file with an unusual field separator such as : ; or @. Assuming these characters only exist as field separators then its very easy to swap the characters to tabs using tr:

  cat input.txt | tr "@" "\t" > output.tsv

For exotic multi-character separators like @;@ you can use awk:

  cat input.txt | awk '{ gsub( /@;@/, "\t" ); print }' > output.tsv

You can also change a normal TSV file into a file with rather more eccentric field separators:

  cat input.tsv | tr "\t" "@" > output.txt
  cat input.tsv | awk '{ gsub( /\t/, "@;@" ); print }' > output.txt

5.4 cedutils

As mentioned in section 5.3, a GNU system can manipulate a TSV file in a number of ways. However there are some jobs that are not possible to complete quickly and easily so I have assembled a group of command line utilities that can be used by the shell to manipulate spreadsheet files.

These utilities are designed for accessing CSV or TSV files using basic scripts. If you need access to data inside a CED/ZIP file then you should unzip it to a temp directory and access the sheets there. Any complex or CPU intensive tasks should be left to custom built C++ programs.

These utilities are accessed via the binary program cedutils. You access its functions by calling it via a symlink to execute a single command. Because all commands really exist in a central program, it is also possible to do multiple different operations within the same line by using the argument "-com" followed by the new command. The advantage of this is in terms of speed and fewer file operations. When you call 2 commands separately in two script lines, you action a load and a save operation twice. However, with multiple commands in a single line you only have a single load and a single save operation which saves CPU cycles, and involves fewer file read/write operations. Here is an example which flips a sheet and inserts 10 empty rows:

  cedflip input.tsv -com insert -total 10 "" -o output.tsv

Compare this to the two operations done separately:

  cedflip input.tsv -o output.tsv
  cedinsert -total 10 output.tsv -o output.tsv

More information can be obtained by consulting the man page:

  man cedutils

5.5 mtcedcli

mtCedCLI is a program that provides spreadsheet functionality via a Command Line Interface (CLI). It offers exactly the same functions as the GUI based mtCellEdit, but instead uses text commands.

Both GUI's and CLI's have value, and allow particular jobs to be done more efficiently. For example I use mtCedCLI as part of the testing phase before I release the whole mtCellEdit suite. I have created a number of tests in the form of scripts which test each part of the core libraries. This exposes any problems that may have crept into the code, and is a good form of regression testing.

5.5.1 Commands

mtCedCLI reads instructions from the user via GNU Readline. This offers useful interactive facilities like using the up and down arrows to access previous commands.

It is also possible to run a script from the command line like these examples:

  cat script.txt | mtcedcli
  mtcedcli < script.txt
  mtcedcli < script.txt > log.txt 2>&1

Here is the complete list of commands that mtCedCLI accepts:

Command Arguments Notes
about About the program
clear content Clear cell content in current selection
clear prefs Clear cell preferences in current selection
clear Clear cells in current selection
clip flip_h Horizontal flip of clipboard
clip flip_v Vertical flip of clipboard
clip load <OS FILENAME> Load clipboard from a file
clip save <OS FILENAME> Save the current clipboard to this file
clip rotate_a Rotate the clipboard anticlockwise by 90 degrees
clip rotate_c Rotate the clipboard clockwise by 90 degrees
clip transpose Transpose the clipboard
copy output Copy the output of the currently selected cells to the clipboard
copy values Copy the values of the currently selected cells to the clipboard
copy Copy the currently selected cells to the clipboard
cut Copy the currently selected cells to the clipboard and clear the cells
delete column Delete the currently selected column(s)
delete graph Delete the currently active graph
delete row Delete the currently selected row(s)
delete sheet Delete the currently active sheet
duplicate sheet Duplicate the currently active sheet
export graph <OS FILENAME> Export the current graph to this file
export output graph <OS FILENAME> <FILETYPE> Export the current graph output to this file and type (as per 5.5.3)
export output sheet <OS FILENAME> <FILETYPE> Export the current sheet output to this file and type (as per 5.5.4)
export sheet <OS FILENAME> <FILETYPE> Export the current sheet to this file and type (as per 5.5.5)
find <TEXT> [wild] [case] [value] [all] Find cells with this text (possibly using the current selection)
help [ARG]... Display help on this command
import book <OS FILENAME> Import a book from a file into the current book
import graph <GRAPH NAME> <OS FILENAME> Import a file as a graph and give it this name
info Display current information
insert column [clip] Insert column(s) according to the current selection (or clipboard width)
insert row [clip] Insert row(s) according to the current selection (or clipboard height)
list files List files in the active book
list graphs List graphs in the active book
list sheets List sheets in the active book
load <OS FILENAME> csv **|** Load a new book
new book Destroy the current book and start a new book with a sheet called "Sheet 1"
new sheet Add a new empty sheet to the book
new Destroy the current book and start a new book with a sheet called "Sheet 1"
paste content Paste clipboard contents onto the current selection
paste prefs Paste clipboard preferences onto the current selection
paste Paste clipboard onto the current selection
print cell num Print the cell number for all the selected cells
print cell text Print the cell text for all the selected cells
print cell type Print the cell type for all the selected cells
print prefs book Print the book preferences
print prefs cell Print the cell preferences in all the selected cells (that are not default)
print prefs sheet Print sheet preferences
print prefs state Print state preferences
print Print the cell output for all the selected cells
q Quit the program
quit Quit the program
recalc book Recalculate the book
recalc sheet Recalculate the sheet
recalc Recalculate the sheet
redo Redo next sheet action
rename graph <NEW NAME> Rename the current graph
rename sheet <NEW NAME> Rename the current sheet
save Save the current book
save as <OS FILENAME> [FILETYPE] Save the current book to this file (optionally changing the type as per 5.5.2)
select < all | CELLREF[:CELLREF] > Set the current cursor selection
set 2dyear [ YEAR START ] Change the currently selected cells if they contain a date with a 2 digit year
set book <INTEGER> Change the current active book (0..4)
set cell <CELL CONTENT> Set the cell text
set graph <GRAPH NAME> Set the current graph
set prefs book <KEY> <DATA> Set book preference (as per A.8.3)
set prefs cell <KEY> <DATA> Set currently selected cell preferences (as per A.8.1)
set prefs cellborder <DATA> Set currently selected cell border preferences (as per 5.5.6)
set prefs sheet <KEY> <DATA> Set the preferences for the current sheet (as per A.8.2)
set prefs state <KEY> <DATA> Set the state preferences (as per A.8.4)
set sheet <SHEET NAME> Set active sheet
set width < auto | INTEGER > Set selected columns width as integer > 0 (0=default=10), or by calculating the current maximum width
sort column <EXPRESSION> Sort selected columns by expression as per 5.5.7
sort row <EXPRESSION> Sort selected rows by expression as per 5.5.7
undo Undo last sheet action

5.5.2 Book File Types

tsv_book
tsv_value_book
ledger_book
ledger_value_book

5.5.3 Graph Output Types

eps
pdf
png
ps
svg

5.5.4 Sheet Output Types

eps
html
pdf
pdf_paged
png
ps
svg
tsv
tsv_q

5.5.5 Sheet Types

tsv
tsv_gz
tsv_noq
tsv_val
tsv_val_gz
tsv_val_noq
csv
csv_noq
csv_val
csv_val_noq
ledger
ledger_gz
ledger_val
ledger_val_gz

5.5.6 Cell Border Codes

-1 Remove all borders
-2 Thin Outside
-3 Thick Outside
-4 Double Outside
-5 Thin Top And Bottom
-6 Thick Top And Bottom
-7 Double Top And Bottom


Horizontal
0 Clear Top
1 Clear Middle
2 Clear Bottom
3 Thin Top
4 Thin Middle
5 Thin Bottom
6 Thick Top
7 Thick Middle
8 Thick Bottom
9 Double Top
10 Double Middle
11 Double Bottom


Vertical
12 Clear Left
13 Clear Centre
14 Clear Right
15 Thin Left
16 Thin Centre
17 Thin Right
18 Thick Left
19 Thick Centre
20 Thick Right
21 Double Left
22 Double Centre
23 Double Right

5.5.7 Sorting Expressions

<INTEGER> , < a | d > [c] [ , ... ]

For example, expression 1,ac,2,d sorts by row/column 1 (ascending, case sensitive), then row/column 2 (descending).

5.5.8 Shell Scripts

Because the interface to mtCedCLI is just text, this means that shell scripts can control the program.

5.5.8.1 Putting Several Sheets into a Book

DOWNLOAD FILE

  #!/bin/sh
  # cedsheetbook.sh - Puts sheet files into a single book file
  # usage: cedsheetbook.sh BOOKFILENAME SHEETFILE... | mtcedcli
  # e.g. cedsheetbook.sh ~/output.zip *.tsv *.csv | mtcedcli
  # by Mark Tyler 19th February 2013
  
  
  echo "delete sheet"
  echo "save as \"$1\""
  
  shift
  
  while [ "$1" != "" ]
  do
  	echo "import book \"$1\""
  	shift
  done
  
  echo "save"

The handy thing with this approach is that you can dry run the script first by using:

  ./cedsheetbook.sh ~/output.zip *.tsv *.csv

If you are happy with the text output then you can commit the actions with:

  ./cedsheetbook.sh ~/output.zip *.tsv *.csv | mtcedcli

5.5.8.2 Extracting Sheets From a Book

The following script does the reverse of the first script. It reads a ZIP file and extracts all of the sheets to TSV files in the given directory.

DOWNLOAD FILE

  #!/bin/sh
  # cedbooksplit.sh - Extracts the sheets from a book
  # usage: cedbooksplit.sh BOOKFILENAME OUPUTDIRECTORY
  # e.g. cedbooksplit.sh ~/output.zip ~/temp_sheet
  # by Mark Tyler 19th February 2013
  
  # On error exit
  set -e
  
  mkdir -p "$2"
  
  TMP_FILE="$2"/tmp.txt
  
  printf "load \"$1\"\n" > "$TMP_FILE"
  
  printf "load \"$1\"\nlist sheets" |
  mtcedcli -q |
  awk '!/^$/ && !/^mtcedcli/' |
  while read SHEET
  do
  	LOWER=$(echo "$SHEET" | tr '[:upper:]' '[:lower:]')
  
  	case "$LOWER" in
  	*csv )	TYPE=csv;;
  	* )	TYPE=tsv;;
  	esac
  
  	echo set sheet \"$SHEET\" >> "$TMP_FILE"
  	echo export sheet \"$2/$SHEET\" $TYPE >> "$TMP_FILE"
  done
  
  cat "$TMP_FILE"
  
  echo
  echo Type y and press ENTER to run this script
  
  read KEY
  
  if [ "$KEY" = "y" ]
  then
  	cat "$TMP_FILE" | mtcedcli
  fi
  
  echo

Previous | Contents | Next