Friday, June 27, 2014

DataBasin: more powerful CSV writing for Select and Select-Identify

DataBasin, the clean-room implementation of a data extraction tool for SalesForce.com, available for GNUstep and Mac, just made a big leap forward!

If you have ever done a SELECT in salesforce.com, you might have noticed that the results are not ordered and that semi-joined fields ("." notation) are handled strangely because a whole object is returned.

DataBasin extracts the columns of the CSV file by checking the first row of the dataset and recursing on the field names of Salesforce's response. The effects in DataBasin are the following:
  • The field order is not preserved
  • If a sub-object is queried, all those fields are grouped together in that object
  • If more than one field is queried on a sub-object and on a certain record this object is missing, the number of columns between records is inconsistent: salesforce.com doesn't return the whole object, and DataBasin iterates record after record into the sub-object, but if it is totally missing, only one empty column will be written, not as many as needed
  • Capitalization of the column names doesn't get preserved (compared to the above issues, something really minor)

I added a new option to write out the fields in the order of the query.

This feature implies more complexity under the hood. For each record received in the response, the field names are reconstructed as before (e.g. by recursing inside sub-objects) and put in a Dictionary.

The original SOQL query is parsed and the field names extracted, then these names are used as keys for each record dictionary and written out.
  • Order preserved, also with sub-objects
  • Fields preserved and empty values are written even if whole sub-objects are missing
  • Case is preserved
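
The write-out described above can be sketched roughly as follows. This is a Python illustration of the idea only, not DataBasin's actual Objective-C code; the function names `flatten` and `write_in_query_order`, the sample field names and the case-insensitive lookup are assumptions made for the example.

```python
import csv
import io


def flatten(record, prefix=""):
    """Recursively flatten nested sub-objects into dotted field names."""
    flat = {}
    for key, value in record.items():
        name = prefix + key
        if isinstance(value, dict):
            # A sub-object: recurse and prefix its fields with "Name."
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


def write_in_query_order(records, query_fields, out):
    """Write one CSV row per record, with columns in the order of the query."""
    writer = csv.writer(out)
    # The header uses the names exactly as written in the query,
    # so order and capitalization are preserved.
    writer.writerow(query_fields)
    for record in records:
        # Assumption: names are matched case-insensitively.
        flat = {k.lower(): v for k, v in flatten(record).items()}
        row = []
        for field in query_fields:
            value = flat.get(field.lower())
            # A missing sub-object still yields one empty cell per field.
            row.append("" if value is None else value)
        writer.writerow(row)
```

With a record whose sub-object is missing entirely, every queried field of that sub-object still gets its own empty column, so all rows have the same width.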
This approach looks fine and comes with a relatively small performance penalty. However, it is delicate, because during parsing of the query many naming subtleties of Salesforce.com need to be inferred, that is, DataBasin tries to infer what the name in the Response will be:
  • field aliases
  • aggregate names with no alias, which are called progressively Expr0, Expr1
  • idiosyncrasies with aggregate functions. A field in a sub-object is called Object__c.Field__c in a regular query, but when used in an aggregate query, only Field__c is used, disregarding the join-depth
  • COUNT() vs. COUNT(Id)
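
To give an idea of the kind of inference involved, here is a small Python sketch that derives the expected column names from a query following the rules listed above. It is not DataBasin's parser: the helper names are made up, the ExprN spelling is taken from the list above and should be checked against a real response, and the special case of COUNT() without an argument is not handled.

```python
import re


def split_fields(field_list):
    """Split a SELECT field list on commas that are not inside parentheses."""
    fields, depth, current = [], 0, ""
    for ch in field_list:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            fields.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        fields.append(current.strip())
    return fields


def expected_column_names(soql):
    """Guess the column names the response will use for each queried field."""
    match = re.search(r"select\s+(.*?)\s+from\s", soql, re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("not a SELECT query")
    fields = split_fields(match.group(1))
    # Assumption: a query is an aggregate query if any field calls a function.
    is_aggregate = any("(" in f for f in fields)
    names, unnamed = [], 0
    for field in fields:
        if "(" in field:
            alias = field[field.rindex(")") + 1:].strip()
            if alias:
                names.append(alias)             # aggregate with an alias
            else:
                names.append("Expr%d" % unnamed)  # numbered progressively
                unnamed += 1
        else:
            parts = field.split()
            if len(parts) > 1:
                names.append(parts[-1])         # plain field with an alias
            elif is_aggregate:
                names.append(field.split(".")[-1])  # join depth is dropped
            else:
                names.append(field)
    return names
```

For example, `expected_column_names("SELECT Id, Account.Name FROM Contact")` keeps both names untouched, while the same dotted field inside a GROUP BY query is reduced to its last component.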
I hope I have not missed any! In any case, enjoy!



This feature is available now both in Select and Select-Identify, since it is a problem of the CSV Write backend and enhances DataBasin operation usability greatly, bringing it on par with DataLoader.
Since DataLoader performs a similar transformation and sometimes produces wrong results, I left this feature as optional with the flag "Write fields in query order".

Sunday, May 04, 2014

ProjectCenter build parser fixes

Debugging build tools...

I was long plagued by the problems GNUstep's IDE ProjectCenter had with parsing the compiler's output. This often made "clicking" on a warning or error impossible. I never dug into the details, but it happened more and more often and was worse on some systems than on others.

Yesterday, while working on another project, I got so annoyed that I debugged until deep into the night to fix them, and found two different sources:
  • On all operating systems, recent versions of gcc (gcc 4.x) emit a "fatal error" that was not being correctly parsed and treated as an error
  • OpenBSD was using egcc and this was causing the build files not to be recognized, so the error was parsed, but opening the correct editor was impossible
I fixed both! Using a modern gcc is thus fine now, as is using OpenBSD, which now works like Linux. GNUstep's cross-platform support thus continues!
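
For illustration, the "fatal error" case can be sketched as a small line parser. This is Python, not ProjectCenter's Objective-C code, and the regular expression and the function name `parse_diagnostic` are assumptions; the point is only that "fatal error" has to be recognized alongside "error" and "warning" and treated as an error.

```python
import re

# file:line[:column]: kind: message
DIAGNOSTIC = re.compile(
    r"^(?P<file>[^:\s][^:]*):(?P<line>\d+):(?:(?P<col>\d+):)?\s*"
    r"(?P<kind>fatal error|error|warning|note):\s*(?P<msg>.*)$"
)


def parse_diagnostic(text):
    """Return a dict describing a compiler diagnostic, or None for other lines."""
    match = DIAGNOSTIC.match(text)
    if match is None:
        return None
    kind = match.group("kind")
    if kind == "fatal error":
        # A fatal error is still an error as far as the IDE is concerned.
        kind = "error"
    col = match.group("col")
    return {
        "file": match.group("file"),
        "line": int(match.group("line")),
        "column": int(col) if col is not None else None,
        "kind": kind,
        "message": match.group("msg"),
    }
```

A line such as `main.m:3:20: fatal error: Foo.h: No such file or directory` then yields a clickable file and line like any other error.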

Happy hacking.

Wednesday, April 30, 2014

DataBasin 0.7

DataBasin 0.7 is out!

If you need to query, update, or create data on SalesForce.com and are on a GNU/Linux, *BSD, Solaris system or MacOS-X... try it out! Do you need to perform a query having objects or IDs instead of a where clause? Select-identify will come in handy for you.

Some of the news:
  • Save results of update and create in a results.csv file with success and failure
  • Progress monitor of operations shows time remaining 
  • The login panel shows a visual icon if login happened successfully or not
  • Progress monitor for Query, Delete, Insert
  • Select identify now supports ORDER BY
  • Internal core enhanced for Delete
  • Many bug fixes and clean-ups

Thursday, April 03, 2014

Graphs: Improved Grid and labels

Lots of new stuff in OresmeKit, the graphing toolkit for GNUstep and Mac! As an example, here is an advanced dashboard based on DataBasin that displays the system load of Salesforce.com. It is not generally available yet, but I hope it will be!

Grid-sizing is now selectable so it gets spaced in 1K or 1M intervals (depending on the data-range available), like it is used in both screenshots in this example.

 - (void)setYAxisGridSizing:(OKGridSizing)sizing;

It can now take: OKGridConstantSize, OKGridKiloMega

Also, one can decide to draw just the label of the minimum and maximum value, or a label for every grid line:

- (void)setYAxisLabelStyle:(OKLabelStyle)style;

It can take: OKNoLabels, OKMinMaxLabels, OKAllLabels

1000-unit Grid



To complement this kind of visualization, a new kind of label formatting can be used. In the example above the numbers are plain: 10.000 is written as such. In the example below it is formatted as 10K; if we were using 10.000.000, it would be 10M.

1000 - grid with K formatting
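
The K/M label formatting can be sketched in a few lines. This is a Python illustration of the idea, not OresmeKit's implementation, and the function name `format_label` is made up for the example.

```python
def format_label(value):
    """Format an axis label, abbreviating thousands as K and millions as M."""
    for factor, suffix in ((1_000_000, "M"), (1_000, "K")):
        if abs(value) >= factor:
            # %g drops trailing zeros, so 10000 becomes "10K" and not "10.0K"
            return "%g%s" % (value / factor, suffix)
    return "%g" % value
```

Values below one thousand are left plain, so a grid running from 0 to 10.000 in steps of 1.000 would be labelled 0, 1K, 2K and so on.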

Monday, February 17, 2014

Tailoring OpenBSD for an old strange computer

I have an ol' OmniBook 800CT. A small, interesting computer, for its time, extremely advanced!
Small form factor, but still a very nice keyboard, something unmatched on modern netbooks. The unique pop-out mouse. The series started out with 386 processor, b&w display and ROM expansions.
The 800CT is one of the latest models: same form factor, SCSI connector, but color screen (800x600) and a hefty Pentium 133 MHz!
But only 32 MB of RAM (the kernel reports 31 of real mem, 24 avail mem)

Original 5.4 kernel: 9.2M
Custom kernel: 5.0 M

This shrinkage is quite hefty, almost 50%! More than raw disk usage, what matters is that this new kernel boots faster and leaves more free memory, enough that X11 is now almost usable.

How can this be achieved? Essentially by removing unused kernel options. If you configure out the drivers you know you don't need, because you don't have the hardware and won't use it (e.g. you know you won't plug in a certain card in the future), they won't be built and won't get into your kernel.
On an old laptop with no expansion except the ports and the PCMCIA port it has, this is relatively easy.

To build your custom kernel, follow the OpenBSD FAQ.

The main idea is to take the kernel configuration file, go over it line by line and see if you have the hardware, which you know by checking your dmesg. Dmesg shows which devices and drivers were loaded. Remember that you do not modify GENERIC, but a copy of it.

You can automate this with a tool called dmassage: it will parse your GENERIC configuration and produce an optimal tuned version, however it will not work out of the box.
Why? There are drivers which do not compile if other drivers are not present.

I'm unsure if this is really a bug. In my opinion it is at least "unclean" code; however, since this kind of extreme driver-picking is mostly not done, it is not fatal and probably won't be fixed.

If you remove all drivers at once, you won't easily find out which one breaks the build, so my suggestion is to remove them in sets. One by one is surely too tedious, since you need to make a build for each.
  1. remove X drivers
  2. build, if it works, copy the configuration file as a backup
  3. test the kernel, optionally, by booting it
  4. continue removal

Thus, in case of breakage, you can narrow it down to fewer options.
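
One removal step can be sketched as a small Python helper that comments out a set of drivers in a copy of the configuration. It is only an illustration under assumptions: the function name `disable_drivers` is made up, the matching is a simplification that only recognizes lines of the form "name at parent", and building and testing the kernel remain manual steps as described in the OpenBSD FAQ.

```python
import re


def disable_drivers(config_text, drivers):
    """Comment out the device lines whose driver name is in `drivers`."""
    result = []
    for line in config_text.splitlines():
        # Device lines look like "et*  at pci?  # comment"; the driver name
        # is the leading word without its unit number or wildcard.
        match = re.match(r"([a-z]+)[0-9*?]*\s+at\s", line)
        if match and match.group(1) in drivers:
            line = "#" + line
        result.append(line)
    return "\n".join(result) + "\n"
```

Applying it with one small set of drivers at a time, and keeping the last configuration that built, gives the backup-and-continue loop of the steps above.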

If your machine doesn't have a certain bus, you may remove all drivers attached to it. But proceed from the leaves, not the trunk: gradually remove the peripheral drivers before removing the bus support.

In my case, I found that an unremovable driver is:
et*    at pci?                # Agere/LSI ET1310


Remember that you are running an unsupported kernel. If you want support for a problem, better reproduce it with the original kernel, of which you should for safety retain a backup copy anyway during the iterative building process.

Addition:

In X11, which needs to be set to 800x600 8-bit mode, I had to uncomment these lines:
    #Option "progLcdModeRegs" "true"
    #Option "progLcdModeStretch" "true"

Friday, February 14, 2014

GAP moves to SVN

The GNUstep Application Project transitioned its source repository from CVS to SVN.

The sources were migrated with full-history and the same directory structure.

Connection details are on the project's page about svn: https://savannah.nongnu.org/svn/?group=gap

Browsing can be done here: SVN Surfing

The CVS repository will remain open for some time, in case something went wrong during the migration process, but it is not official or current anymore.

Thursday, December 12, 2013

GWorkspace, how to easily mount/unmount volumes

GWorkspace allows for easy mounting and unmounting of volumes (disks, usb sticks, CD-ROMs) like you would do manually on console, without the requirement of daemons or other tools.

Here is how.

First, be sure your underlying operating system is correctly configured to allow you to do that. I'll show as example how I configured my Debian system.

Your current user needs to have the permissions to mount the volume(s); this is usually done by being part of the correct group. For Debian we find cdrom and floppy as ready-made groups:

cdrom:x:24:multix

The predefined mount-point for the cdrom is /mnt/cdrom. Since I usually deal with USB sticks, I added a mount point for them:

$ ls -l /media/
total 8
lrwxrwxrwx 1 root root    6 Jan  1  1970 cdrom -> cdrom0
drwxr-xr-x 2 root root 4096 Jan  1  1970 cdrom0
drwxr-xr-x 2 root root 4096 Nov 27 14:01 usb


Now, let's add the correct lines to /etc/fstab for our usb entry. The cdrom entry is usually already configured by Debian.

/dev/hdc        /media/cdrom0   udf,iso9660 user,noauto     0       0
/dev/sdb        /media/usb      auto    defaults,users  0       0


Here auto shall determine the file system automagically, while users allows any user to mount and unmount the volume. You can check that everything is configured fine by trying to "mount" from the command line with your current user.
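
As a quick sanity check of the configuration, the following Python sketch lists the mount points that a regular user is allowed to mount according to the fstab contents. It is just an illustration and has nothing to do with how GWorkspace reads the file; the function name `user_mountable` is made up, and only the user and users options are considered.

```python
def user_mountable(fstab_text):
    """Return the mount points whose options include "user" or "users"."""
    points = []
    for line in fstab_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete entry
        options = fields[3].split(",")
        if "user" in options or "users" in options:
            points.append(fields[1])
    return points
```

It can be fed the real file with `user_mountable(open("/etc/fstab").read())`; with the two entries above, both /media/cdrom0 and /media/usb should be reported.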

Then open SystemPreferences and configure the managed paths as shown here:

SystemPreferences - Mount points
Configure the mount points for removable media, here I have enabled the cdrom and the usb mount points. The mtab path is currently ignored by GWorkspace, so don't bother and the default value is anyway correct for Linux.

Now, in GWorkspace we can select Check for disks under the Tools menu.
GWorkspace - check for disks
If everything goes well, voilà, your volume will get mounted. It will appear on the desktop if you have the Desktop enabled and you can show its content. In any case it shall be visible in the File Viewer at its mount place.

Usb stick contents




If all options are set correctly as suggested, the volume is readable and writable, as you can check with the Attributes Inspector. We can also open the .zip file with Zipper easily.

Zipper and Attributes


To unmount a volume, just drag its icon to the Recycle Bin.

GWorkspace, fixed mount bug on Linux

Fixed a nasty bug (I was using the wrong macro) that was causing GWorkspace not to mount removable volumes (like a USB key) on Linux.