Back to Contents Previous Next
User operations
1) On loading a file
When a CSV/TSV file is loaded the main window becomes active - and will look typically like the following screenshot.
The window title has changed to confirm that a file is loaded and the particular file path is shown in the top left icon.
Below this is shown the number of records in the loaded file; its field separator (comma or tab); and its record terminator (LF, CR, LF+CR or CR+LF) - all defined from the first record of the file.
The number of rows of ‘field icons’ in the window will have been increased to match the number of fields-per-record - and the contents of the first record will be displayed.
Thus, in the shown example above, there are 8 records; the field separator is a comma and the record terminator is LF - and each record contains 6 fields. The first record is displayed and it is seen that its 5th field happens to be blank.
In this display, any surrounding double-quotes to a field will have been stripped. So, anything visible in this display is part of a field’s true content - which might include other double-quotes, commas, etc.
Purely as a convenience, which has no effect on the output produced, the contents of the other records can be examined by using the nudger icons or by entering a record number into the writable icon in the usual way.
Whenever a different file is loaded the number of rows of ‘field icons’ will be automatically changed appropriately - and the window size adjusted accordingly, if necessary.
Finally, on loading a file, you will notice that the group of icons concerned with saving a new output-file (at the top right of the window) are still all disabled at this stage.
2) Constructing a new output file in the main window
A new output-file construction is based on the currently loaded file and its structure is determined by making entries (in the main window) into the rows of ‘field icons’ to the right of the icons showing the field contents.
In each of the field rows there are seven icons capable of affecting the output structure. These icons are colour coded and operate differently:
a) The column of orange icons (with “Order” shown at the top).
This is the only column in which
at least one entry must be made in order to cause the group of ‘save icons’ to be enabled
(at the top right of the main window)
.
This orange column works by clicking <select> or <adjust> over an icon to designate the required order in the output file of the input-file fields. The first click puts “1” in the clicked icon: the second click puts “2” and so on.
A click on an icon already filled will do nothing. So the only way to change any existing settings in this column is to hit the ‘Clear’ button at the top of the column and start again.
You are not obliged to choose more than one input-file field in this column. Each one selected represents one field in the output-file - so if you only choose 1 field in this orange column there will only be one field-per-record in the output file. Simply choose the fields you want - in the order you want them to appear - and leave the rest blank. (You can, of course, choose them all.)
An example will make the above operation clearer:
i) Assume that you have loaded a CSV file with six fields per record - as in the earlier screen-shot. The display will show initially each field of the first record - and there will be six blank orange icons in a vertical column.
ii) Let’s assume you want to produce an output file by extracting only three of the fields from the input file, and you want the first field of the output file to be Field #6 from the input file; the second output field to be Field #3 from the input file; and the third output field to be Field #1 from the input file.
iii) Make your first click in the orange icon in the Field #6 row - a 1 appears in it.
(And notice that the ‘save icons’ in the top right panel become enabled.)
iv) Make your second click in the orange icon in the Field #3 row - a 2 appears in it. And your third click in the orange icon in the Field #1 row - a 3 appears in it.
v) Leave the remaining orange icons untouched i.e. Fields 2, 4 & 5 of the input file will be ignored.
You can now proceed to make further constructions as described below.
b) The two columns of grey icons (with “Join to” shown at the top)
These two columns allow you to join (concatenate, string-wise) - in the output-file - up to two other input-file fields to the corresponding (orange column) input-file field.
Entries are again made by clicking over the icons but the action is different this time. The first click with <select> over any grey icon will enter “1” and the second <select> click over the same icon will produce “2” and so on. Clicking with <adjust> will reduce the number by 1. The allowed range is automatically constrained to the number of fields in the input-file (i.e. number of rows displayed). Thus you can choose any input-file field any number of times in these grey columns and you can alter the choice at will. (The ‘Clear’ buttons will again clear all entries in their respective columns.)
Thus, if you have chosen input-file Field #6 to be the first field of the output-file (i.e. you will have put “1” in the orange icon of the “Field #6” row) and you now enter 3 and 1 respectively in the two grey icons of this same row - then the output-file will have
all the first fields of its records
made up of a concatenation of input-file fields 6+3+1.
(If you wish to insert a blank space - or any other text - between these fields then you will need to use the intervening white writable icons, as below.)
c) The white writable icons
These work in the usual way and simply give you a number of options for inserting your own choices of text/characters in various places in the output fields. Each white icon will take up to 32 characters.
On any one row you can use as many or as few of the four writable icons as you wish - even if you make no selection in the grey ‘Join to’ icons. Thus, if no ‘Join to’ choice is made, the layout allows you to have up to 32 characters before the field in that row and up to 96 characters after it. More characters are possible by doing the job in two stages - remembering that any single field (including any surrounding double-quotes) must not exceed 255 characters.
The writable icons have the added facility of allowing you to copy an entry from any of these white icons into any of the others. <ctrl-C> copies an existing entry and <ctrl-V> pastes it.
d) Selecting individual records
In addition, there is the further option of selecting the individual input-file records which you wish to appear in the output file - modified by the construction, of course. This is conrolled by the small panel to the left of the window, above the field contents.
By default, when you load a new input file all records will be selected for output.
(This can be confirmed by stepping through the records and noting that the ‘Include shown record in output’ option is always ticked.)
Any record can be selected/de-selected by stepping to it and ticking/unticking the option.
To make things easier, the ‘Select all’ and ‘De-select all’ buttons can be used to set the most convenient start point for your selection/de-selection needs.
3) Saving a constructed output file
When you are happy with your ‘field icon’ entries and your record selecting/de-selecting, it is time to address the saving options at the top right of the window. They are pretty straightforward.
You are offered choices of:
- record terminator, in the left-hand column of radio icons;
- field separator, either comma or tab, in the middle column of radio icons;
- whether or not to enclose each field with double-quotes.
(Note that, irrespective of the choice made here, any individual output field content found to include a comma (ASCII 44) or double-quote (ASCII 34) will automatically be enclosed with double quotes in the resulting output-file.)
Having made your choices, press the ‘Save new file’ button and a save window will open. This will allow you, finally, to choose between a ‘CSV’ (&dfe) or ‘Text’ (&fff) filetype for your output file. This choice does not alter the file contents in any way and you can always use the Filer menu to change the filetype again later if you wish.
Drag the file icon to your destination directory or press ‘OK’ in the usual way. All the usual warnings will occur if the writable icon is not sufficiently filled. You will also be warned if the new file would overwrite an existing file.
Your saved output file will contain a newly constructed record for each of the selected records of the input file.
If your particular construction would result in any of the fields in the intended output exceeding 255 characters, then a ‘non fatal’ error message will appear and it will indicate which field in which record is causing the problem. The output process will be halted but your construction settings will still be in place, to allow you to make alterations and try again.
Finally, if any of the selected input-file records have more fields or less fields than its defining first field then the processing will eliminate these differences in the output file. That is, empty fields will be added to replace missing fields - and excess fields will be removed. Thus all records of the output file will have the same number of fields
4) Saving the construction settings
Setting the ‘field icons’ to make your output file construction might have taken you some time and so, for convenience, you can save the settings - before or after you have saved a new output file based on them - by pressing the ‘Save settings’ button. This will bring up a conventional Save box with a special file icon (of file-type &005) which will allow you to save the current construction settings to a place of your choice. (This action does not save the ‘save icons’ settings nor the input-file record selections, as these may need to be varied for the same construction settings.)
The saved file will have the following icon:
Note that any saved settings file is matched to the ‘size’ (number of fields) of the input CSV file that was loaded at the time the settings were saved - see below for the implications of this.
5) Loading a construction settings file (type &005)
You can load a settings file by dragging it to the main window - provided that a CSV file is already loaded. (You cannot load a settings file by double-clicking on it, nor by dragging it to the iconbar icon.)
If the dragged settings file is ‘larger’ (has more rows/fields) than the currently-loaded CSV file, then a warning box will appear and the settings file will not be loaded.
If the the dragged settings file is the same size as the currently-loaded CSV file then loading will take place and any current contents of the ‘field icons’ will be replaced by their corresponding settings file values.
If the dragged settings file is smaller than the currently-loaded CSV file then a warning will appear but the file will be loaded as follows:
- any current settings in the first N rows of ‘field icons’ will be completely replaced by the corresponding values in the settings file (where N is the number of rows saved in the settings file); and
- any current settings in the ‘Order’ column below row N will be cleared; and
- any current settings in the other ‘field icons’ below row N will remain untouched.
(Note that loading a settings file will only enable the ‘save icons’ panel if the settings include entries in the ‘Order’ column.)
6) Adding extra fields
Blank fields can be added a loaded input file, and it is a process which is completely separate from the operations described above.
After an input file has been loaded, pressing the button “Insert new fields only...” brings up a small window similar to the following screen-shot.
As with the main window, the vertical size of this window and the number of icons it will have depends on the loaded CSV/TSV file. In the case shown in the above screen-shot the loaded file has 6 fields per record. The top option icon is to add a new field before the existing first field, and this is followed by a number of option icons to add new fields after each of the existing fields - here, 6 such option icons, giving a total of 7 option icons.
The user simply ticks whichever icons required and then clicks the “Save new file...” button to open a conventional save window.
The resulting new file will be a copy of the loaded input file with extra blank fields inserted at the chosen place(s). The separator and terminator will be the same as the loaded file and any surrounding double-quotes will be preserved (but the new blank fields will not have surrounding double-quotes).
The originally loaded file will not be altered in any way and will still be loaded. So if subsequent manipulation of the new file is required it will need to be loaded as a new file.
If the input file has any records with less fields than the first record (i.e. the defining record) then sufficient extra field separators will be inserted automatically into the output file for such records to ensure that the number of fields is the same as in the defining record - before processing the user-chosen additional fields.
Conversely, if the input file has any records with more fields than the first record then the additional fields will be preserved in the output by copying them as trailing fields after processing the user-chosen additional fields.
7) Splitting a field into two
Any field in a loaded input file can be split into two fields, and it is a process which is completely separate from the operations described above.
After an input file has been loaded, pressing the button “Split a field only...” brings up a small window similar to the following screen-shot.
At the top, the user can choose which field (in every record of the loaded file) is to be split.
Then the window offers the following choices for how to split this field:
- Split after/Split before a fixed number of characters in the field.
If ‘Split after’ is chosen, the split takes place after the chosen number of characters from the Left e.g. a field “ABCDEF” would be split as “ABCD” and “EF” if the value was 4.
(If there are less characters than the chosen number - or the same number of characters - then an empty new field is placed after the unaltered original field.)
If ‘Split before’ is chosen, the split takes place counting the characters from the Right e.g. a field “ABCDEF” would be split as “AB” and “CDEF” if the value was 4.
(If there are less characters than the chosen number - or the same number of characters - then an empty new field is placed before the unaltered original field.) (Also, in the trailing case, any trailing blank spaces are stripped before applying the rules.)
- Split after/Split before a specified character in the field - with the option to remove the chosen character. The character is case sensitive.
If ‘Split after’ is chosen, the split takes place after the
first
occurrence of the chosen character counting from the Left e.g. a field “ABDCDF” would be split as “ABD” and “CDF” if the character was D and the remove option is not chosen.
(If the character is not found then an empty new field is placed after the unaltered original field.)
If ‘Split before’ is chosen, the split takes place at the
first
occurrence of the character counting from the Right e.g. a field “ABDCDF” would be split as “ABCD” and “DF” if the character was D and the remove option is not chosen.
(If the character is not found then an empty new field is placed before the unaltered original field.) (Also, in the trailing case, any trailing blank spaces are stripped before applying the rules.)
- Split after leading/Split before trailing numbers.
If ‘Split after leading’ is chosen, the split takes pl
ace after the end of a continuous
leading
run of the characters
0-9 e.g. a field “1234ABDCDF” would be split as “1234” and “ABDCDF”. In addition, if the leading run will be extended if the character “.” or “/” or “-” is surrounded by these number characters. Similarly, if a leading run of the number characters includes bracketed numbers then the brackets will be treated as numbers. E.g. “123-4ABC” will be split as “123-4” and “ABC”; and “1234(66)7ABC” will be split as “1234(66)7”
and “ABC”. However, “123-ABC” will be split as “123” and “-ABC”
If ‘Split before trailing’ is chosen, similar rules apply but counting from the Right for a trailing
continuous run of the characters
0-9.
(Also, in the trailing case, any trailing blank spaces are stripped before applying the rules.)
- Split after leading/Split before trailing letters.
If ‘Split after leading’ is chosen, the split takes pl
ace after the end of a continuous
leading
run of the characters
A-Z/a-z e.g. a field “ABDCDF1234” would be split as “ABDCDF” and “1234”. In addition, if the leading run will be extended if the character “/” or “-” is surrounded by these letter characters. Similarly, if a leading run of the characters includes bracketed letters then the brackets will be treated as letters, with similar results as in the above number case.
If ‘Split before trailing’ is chosen, similar rules apply but counting from the Right for a trailing
continuous run of the letter characters
.
(Also, in the trailing case, any trailing blank spaces are stripped before applying the rules.)
- Split after leading/Split before trailing non-digits.
If ‘Split after leading’ is chosen, the split takes pl
ace after the end of a continuous
leading
run of any characters
except 0-9 e.g. a field “$.Pics.Holidays.view1234” would be split as “$.Pics.Holidays.view” and “1234”. There is no special treatment of the characters “/”, “-” or brackets.
If ‘Split before trailing’ is chosen, similar rules apply but counting from the Right for a trailing
continuous run of the non-digit characters
.
(Also, in the trailing case, any trailing blank spaces are stripped before applying the rules.)
After making the choices, pressing the ‘Save new file ...’ button brings up a small save window which allows the final choice of file-typing the output file as Text or CSV - and it will always reflect the input file-type at first.
When saved the output file will be a new file, leaving the input file unaltered and ignoring any settings which might be present in the main window. The field separator and record terminator of the new file will be exactly the same as in the input file.
If further manipulation of the new file is required it will have to be loaded into !CSVamp in the usual way - but remember that the new file will have one more field than the input file which leads to the possibility that the new file might exceed the current setting of the maximum number of fields per record - see later. Also, depending on the input file, the same output can sometimes be achieved using different options.
By using the splitting options in cascade with other !CSVamp facilities a huge range of manipulations can be achieved painlessly.
8) Sorting
Simple sorting of text can be carried out on a loaded input file, and it is a process which is completely separate from the operations described above.
After an input file has been loaded, pressing the button ‘Sort only ...’ brings up a small window similar to the following screen-shot.
This window offers the following sort choices:
- Sort field: Sorting can take place on any field of the input file and is chosen by using the ‘bump’/‘nudger’ icons to the right of the Pale Yellow icon at the top of the window.
- Sort routine: Both a ‘bubble sort’ and a ‘quicksort’ routine is offered and the choice is made by clicking the corresponding radio icon. I do not know much about sorting but am aware that the speed of different sort routines can depend on how close the list already is to being sorted. So a choice of methods seemed sensible. The ‘quicksort’ option is selected initially.
- Alphabetically ‘up’ or ‘down’: The choice is made by clicking on the corresponding radio icon ‘Alpha up’ or ‘Alpha down’.
- Case sensitive: By default the alphabetical sorting will ignore the case of letters, but ticking the option icon will cause a case-sensitive sort to be done (see below for consequences).
- Ignore 1st record: If ticked, the sorting will take place on all records after the first. This is to cater for the common practice of making the first record of a CSV/TSV file indicate the description of each field e.g. the first record of a CSV address file might be:
Name,Address #1,Address#2,Address#3,Address#4,Postcode
and you would want to keep this record as the first and not include it in the sort process.
- Show dups of sort field: If ticked, after sorting has been completed any duplicated sort field contents will be identified and noted in a text-file called Duplicates (saved to the !CSVamp application folder). This file will be automatically opened. The list will be in the format:
Record 7 <sort field contents>
where the record number refers to the records in the sorted output file. (Note that the sort field contents will be shown in lower case if ‘Case sensitive?’ is not ticked - but the sorted file will have the same cases as the source file.)
If more than two identical sort field contents are found then the list will show consecutive record numbers with the same text. Note that blank fields - or fields with only spaces - will also be tested for duplication: and in such cases only the record number(s) will appear in the list.
When the choices have been made, pressing the ‘Save new sorted file’ button will open a conventional save box to allow the sorted file to be saved to whatever location you wish. The file-type offered will initially be the same as that of the loaded file, but the choice between CSV/TSV (&dfe) and Text (&fff) can be made by the user.
The saved sorted file will contain the exact contents of the input file - just differently ordered. There is no attempt (nor need, when sorting) to change the output records to compensate for too many or too few fields.
Also, the sorting process will completely ignore any settings that you may have made in the main window i.e. for constructing an output file and/or for selecting/de-selecting certain records. Therefore, if you want end up with a sorted sub-list of a certain file, it is necessary first to produce an output file of the required sub-list (using the operations in Sections 3-6 earlier) and then load this output file as a new input file for sorting.
Please note that sorting essentially puts items in the order of their ASCII codes. So, assuming an ‘Alpha up’ choice, blank fields (ASCII 0, or possibly ASCII 32) will appear at the top (lowest ASCII number is ‘highest’ alphabetically) of a sorted list. Similarly, numbers will appear before letters and will not necessarily be in numerical order. (E.g. numbers in the order 9,8,87,0,88,7 will be regarded as “9”,“8”,“87”,“0”,“88”,“7” and will be sorted as 0,7,8,87,88,9.)
However, for letters, the !CSvamp user can choose whether or not to take the case into consideration. By default, the ‘Case sensitive?’ option is unticked and hence the case of letters is ignored - so that a list in the order “car”,“Cat”,“Can” will be sorted (for ‘Alpha up’) as “Can”,“car”,“Cat”. But if ‘Case sensitive?’ is ticked then the order would be “Can”,“Cat”,“car” - because all lower-case letters have a higher ASCII code than upper-case letters.
Finally, although a pair of double-quotes surrounding a field will be ignored during sorting, further single-quotes or double-quotes within the outer double-quotes will be treated as valid text characters for sorting.
Top of page Back to Contents Previous Next