Archive for the ‘Tech Notes’ Category.

curl clue that you have left out a dash:

Enter host password for user 'ser':

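This is the result of typing -user someuser:somepass where it should have been --user someuser:somepass. With a single dash, curl reads -user as the short option -u followed by the argument 'ser', so it prompts for that user's password.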
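Here is a minimal sketch of why the prompt names 'ser'. It is written in Ruby purely for illustration (curl itself is not implemented this way), and parse_short_option and SHORT_OPTS_WITH_ARG are hypothetical names that model only the -u flag:

```ruby
# Simplified, hypothetical model of getopt-style short-option parsing.
# Only -u is modeled as a flag that consumes a value.
SHORT_OPTS_WITH_ARG = ['u'].freeze

# Returns nil for tokens that are not single-dash options,
# otherwise a hash with the flag letter and any value glued to it.
def parse_short_option(token)
  return nil unless token.start_with?('-') && !token.start_with?('--') && token.length > 1

  flag = token[1]
  rest = token[2..-1]
  if SHORT_OPTS_WITH_ARG.include?(flag) && !rest.empty?
    # Everything after the flag letter is taken as the flag's argument.
    { flag: flag, value: rest }
  else
    { flag: flag, value: nil }
  end
end

p parse_short_option('-user')
p parse_short_option('--user')
```

With one dash, the token splits into the flag 'u' and the value 'ser'. With two dashes it is not a short option at all, so it would go to long-option handling instead.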

Note to self: must name a user ‘ser’ with password ‘ass’ so I can write

curl -user -pass [url]


mysql -user -pass [db]

CGI.pm does not preserve Content-type of multipart non-file data

Ok, that sounds a bit complex, but basically if you have data such as

POST https://someurl
Content-Length: 503
Content-Type: multipart/form-data; boundary=xYzZY

Content-Disposition: form-data; name="someFile"; filename="somefile"
Content-Type: text/plain

lines in

Content-Disposition: form-data; name="otherFile"; filename="otherfile"
Content-Type: text/plain

lines in


Content-Disposition: form-data; name="RegularData"
Content-Type: text/xml

<Parameter2>value number two</Parameter2>

and you use CGI.pm to process it, there is nowhere in the query object returned by CGI->new that stores the fact that RegularData has Content-Type: text/xml. You can see this in the code here:

if ( ( !defined($filename) || $filename eq '' ) && !$multipart ) {
my($value) = $buffer->readBody;
$value .= $TAINTED;

The only place that knew about the Content-type: text/xml was in %header, a local variable that goes out of scope when we go to the next parameter.

Not a huge deal, but sometimes it matters. I could patch the module, use some other method of parsing the multipart data, or guess the format by inspection. Probably I'll be lazy and inspect the data.
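The "guess the format by inspection" option could look something like the sketch below. It is in Ruby rather than Perl, guess_content_type is a hypothetical helper, and the only rule is an assumed heuristic: a value whose first non-blank character is '<' is treated as XML, and anything else as plain text.

```ruby
# Hypothetical helper: guess a content type from the parameter value itself,
# since the part's original Content-Type header is no longer available.
def guess_content_type(value)
  text = value.to_s.lstrip
  # Assumed heuristic: markup starts with an angle bracket.
  return 'text/xml' if text.start_with?('<')

  'text/plain'
end

puts guess_content_type('<Parameter2>value number two</Parameter2>')
puts guess_content_type('just some words')
```

This is deliberately crude. It cannot tell XML from HTML, and a plain-text value that happens to start with '<' will be misclassified, so it only suits cases where the set of possible formats is known in advance.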

This applies to CGI.pm versions 3.51 and 3.60.

p not div

Quick SEO tip – Facebook seems to recognize <p> tags as the preferred place to quote paragraphs from, rather than taking earlier content enclosed in <div> tags. Perhaps other sites also treat <p> as an indicator of real text.

vote for Strawberry Perl

At least when requiring XML::LibXML in a Windows environment, it was much easier to get running with Strawberry Perl than with ActiveState, mainly because Strawberry Perl has libxml and libxslt included with the install. Also I like using cpan better than ppm. ppm archives for perl 5.14 do not seem complete, and ActiveState will not give you an earlier perl in the community edition.

So, while I appreciate the contributions of both maintainers, the most painless path is the one to take…and in this case that was Strawberry Perl 5.14 with a cpan install of some other required modules (XML::LibXML was already installed).

Ctrl-Alt-Del for VirtualBox on MacOS

Ok, not a showstopper, but I was running Windows in VirtualBox on macOS and had to press Ctrl-Alt-Del to start. There is an Alt key on my MacBook Pro, but it’s fn-option, and that didn’t work. The answer turned out to be in the VirtualBox VM menu: Machine->Insert Ctrl-Alt-Del. There might be other ways as well, but that was enough to get it going…

hadoop fs is space-sensitive

HDFS, the Hadoop Distributed File System, is useful for big data. However, hadoop fs is not quite there as a shell replacement. Today I kept getting the message

cp: When copying multiple files, destination should be a directory.

when trying to copy multiple files to a directory using

hadoop fs -cp /path/to/files/*  /path/to/destination/directory

Finally figured out that the problem was I had two spaces between the file list and the directory path, which made hadoop not see the directory path in the command. Aaahh.

don’t try creating gdbm file on an nfs mount

gems/gdbm-1.2/lib/gdbm.rb:256:in `initialize': Empty database (GDBMError)

error occurs when trying to use

g = GDBM.new('somefile')

on an nfs-mounted partition. GDBM works fine on normal drives, just don’t try it on nfs-mounts. Posting this as I found nothing when I googled the error message, and wasted several minutes before I realized the problem. The error message may be specific to the ruby ‘gdbm’ gem, but the rule is a general one.
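One way to catch this before opening the database is to check which filesystem the target path lives on. The sketch below assumes Linux, where /proc/mounts lists one mount per line as device, mount point, and filesystem type. fs_type_for and nfs_path? are hypothetical helper names, not part of the gdbm gem.

```ruby
# Returns the filesystem type of the longest mount point containing path,
# given the text of a mounts table in /proc/mounts format (Linux assumption).
def fs_type_for(path, mounts_text)
  best = nil
  mounts_text.each_line do |line|
    _device, mount_point, fs_type = line.split
    next if mount_point.nil? || fs_type.nil?

    # Match whole path components only, so /mnt/share does not match /mnt/sharefoo.
    prefix = mount_point == '/' ? '/' : "#{mount_point}/"
    next unless path == mount_point || path.start_with?(prefix)

    best = [mount_point, fs_type] if best.nil? || mount_point.length > best[0].length
  end
  best && best[1]
end

# True when the path sits on a filesystem whose type starts with "nfs" (nfs, nfs4).
def nfs_path?(path, mounts_text)
  type = fs_type_for(path, mounts_text)
  !type.nil? && type.start_with?('nfs')
end

sample = "/dev/sda1 / ext4 rw 0 0\nserver:/export /mnt/share nfs4 rw 0 0\n"
puts nfs_path?('/mnt/share/somefile', sample)
puts nfs_path?('/tmp/somefile', sample)
```

In real use the second argument would be File.read('/proc/mounts') and the first an absolute path such as File.expand_path('somefile'). If the check returns true, create the file on a local disk instead.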

Wordpress debug notes

Note: I’m not a WordPress expert, just returning to it after several years without having touched PHP – and looking for the best way to quickly understand the flow of a WordPress site using BuddyPress and a few other plugins. Raw notes here, will be annotated as I progress…

Disable spell-check in chrome

Spell checking and typeahead are two of my top gripes with modern software. URL or bookmark completion is ok, since that is when I’m in ‘trying to remember’ mode. But when I’m in the flow of writing, having the computer guess what I’m trying to say is incredibly distracting and annoying.

Originally, I thought gmail was running auto-spellcheck for me, but it was the browser, in this case Google Chrome.

In Chrome, you turn off spellcheck under chrome://settings/language ; uncheck the ‘Enable spell checking’ box underneath the list of languages.

(You can also get to this advanced settings screen by clicking the wrench in the upper-right, selecting ‘Preferences’ and ‘Under the Hood’, then clicking ‘Languages and Spell-checker Settings’.)

Fix for CreateQualificationType returning 400 error using rturk

Update 7/14/11 – I believe this is fixed in rturk 2.4 – thanks Mark!
rturk, the Ruby gem for making calls to the Amazon Mechanical Turk API, uses a REST transport layer. That’s fine, but all calls are currently performed with a GET, which has a length limitation. When making calls that include long strings of data, such as the XML for a QuestionForm structure in a qualification test, errors may occur with the unhelpful message ‘400 Request Error’.

Was able to patch it by making a change to lib/rturk/requester.rb :

< RTurk.logger.debug "Sending request:\n\t #{}?#{querystring}"
< RestClient.get("#{}?#{querystring}")
> # RTurk.logger.debug "Sending request:\n\t #{}?#{querystring}"
> # RestClient.get("#{}?#{querystring}")
> RTurk.logger.debug "Posting request to #{}:\n\t #{params.inspect}"
>, post_params)

A more robust fix might be to use POST only for longer requests, or make it an explicit option on the RTurk object
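A rough sketch of the "POST only for longer requests" idea follows. It is not rturk code: choose_http_method is a hypothetical helper, and the 2000-character threshold is an assumed placeholder, since the real limit depends on the server and anything sitting in front of it.

```ruby
require 'uri'

# Assumed placeholder threshold, not a documented Mechanical Turk limit.
MAX_GET_QUERY_LENGTH = 2000

# Picks :get for short requests and :post once the encoded
# query string would exceed the threshold.
def choose_http_method(params, max_length = MAX_GET_QUERY_LENGTH)
  querystring = URI.encode_www_form(params)
  querystring.length > max_length ? :post : :get
end

puts choose_http_method('Operation' => 'GetAccountBalance')
puts choose_http_method('Operation' => 'CreateQualificationType', 'Test' => '<QuestionForm>' + 'x' * 5000)
```

The caller would then send the request with whichever method comes back. Measuring the encoded length rather than the raw length matters because XML payloads grow noticeably once they are URL-escaped.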