Jan 15, 2016

jQuery.get() Ajax call fails on localhost url

While going through the book Learning jQuery, 4th Edition, I've been doing the exercises by placing and editing code files in a local directory on the desktop and opening index.html as a file in Firefox.

This worked fine until I had to code an Ajax call that referred to a php file. This required the use of a web server, so I created a subdirectory under the local server root and copied the php file into the subdir.

Keeping all other code on the desktop, I then edited the Ajax call so that it requested the URL

  http://localhost/learning-jquery/e.php

The call silently failed and returned nothing. The browser console showed nothing either, no errors.

Eventually, I found out about the same-origin policy. In brief, this policy means that by default JavaScript is prevented from making requests across domain boundaries, where same-origin means that the protocol, hostname, and port number must be identical.

In my case, the page was being loaded from a file:// URL while the request went to http://localhost, so the origins did not match. So I moved all code files into the subdir under the server root. Working from there rather than from the desktop, the Ajax calls then succeeded.

(And I changed the requested URL back to just e.php.)

   ***

But what about carrying out cross-origin access? One possible way is to configure the server to allow it. For Apache on my Linux system, I added the following directive in /etc/apache2/apache2.conf

  <Directory /var/www/html/learning-jquery>
      Header set Access-Control-Allow-Origin "*"
  </Directory>


However, checking this configuration change with the command

  # apachectl -t

resulted in

AH00526: Syntax error on line 205 of /etc/apache2/apache2.conf:
Invalid command 'Header', perhaps misspelled or defined by a module not included in the server configuration
Action '-t' failed.
The Apache error log may have more information.

It turns out that I also had to enable a module named headers

  # a2enmod headers

Then restart the server. That enabled cross-origin access.
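
For completeness, here is the whole sequence as a sketch, using the same apachectl tool as above for the restart (the exact restart command varies by distribution). A curl request afterwards shows whether the Access-Control-Allow-Origin header is actually coming back in the response.

  # a2enmod headers
  # apachectl restart
  # curl -s -D - -o /dev/null http://localhost/learning-jquery/e.php | grep -i access-control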

Not yet satisfied, I wondered whether the wildcard "*" in the directive could be replaced by something more restrictive. For the way I'm developing locally, the following also works, because a page opened via the file:// protocol has an origin of null.

  <Directory /var/www/html/learning-jquery>
      Header set Access-Control-Allow-Origin "null"
  </Directory>


   ***
 
The fact that the calls had failed silently was very troubling. One way to avoid this is to register a global Ajax error handler by calling ajaxError(). Here's a handler that simply puts up an alert box. (I'm using jQuery 1.9)

$(document).ready(
  function setAjaxErrorHandler() {
    $(document).ajaxError(
      function alertError(event, jqxhr, settings, thrownErr) {
        alert('Ajax error handler: ' + jqxhr.status
              + '  ' + jqxhr.statusText);
      }
    );
  }
);


For the cross-origin error, this puts up the message "Ajax error handler: 0  error", which is not exactly super helpful but better than nothing.

Another way is to attach an error handler to the particular request by using the fail() method. For example,

$.get('http://localhost/learning-jquery/e.php', { ... },
       function (data) { ... })
    .fail(function (jqxhr) {
        alert('GET error: ' + jqxhr.status + '  '
              + jqxhr.statusText);
    });



Sources:

jQuery.get()
https://api.jquery.com/jQuery.get/

Same-origin policy
https://en.wikipedia.org/wiki/Same-origin_policy

Why is CORS important?
http://enable-cors.org/

CORS on Apache
http://enable-cors.org/server_apache.html

Configure Apache To Accept Cross-Site XMLHttpRequests on Ubuntu
https://harthur.wordpress.com/2009/10/15/configure-apache-to-accept-cross-site-xmlhttprequests-on-ubuntu/

Jan 7, 2016

How to Extract Columns from a CSV file (without using a spreadsheet)

One of my colleagues who does data analysis had a CSV (comma-separated values) file that he could not open in Excel or in any other spreadsheet program he tried.

Whether because of its sheer filesize of 125MB or its more than 1,500,000 rows, the programs would gag on it.

It turned out that for his purposes he did not need all of the data, only a subset of the columns. Maybe extracting just what he needed into a smaller CSV would let him work with it.

I had earlier read parts of the book The Linux Command Line, by William Shotts. I vaguely remembered mention of a utility to selectively pull fields out of a text file. That turned out to be the cut command.

Here's what the first few rows of the original data file looked like.

id,type,distance,userid,charityID,time,lat,lon
1003529743,walk,4.48342,1000545086,2166731,"2015-06-30 00:00:05",40.2501,-76.6714
1003529744,Run,4.087,1000402641,15048,"2015-06-30 00:00:21",45.5244,-89.7398
1003529745,run,2.631135,1000258381,61635018,"2015-06-30 00:00:23",41.6281,-87.193
1003529746,Bike,1.216703,1000505816,18010,"2015-06-30 00:00:24",43.0306,-78.7963
1003529747,walk,2.069664,1000015957,18010,"2015-06-30 00:00:25",39.2481,-76.5165
1003529748,Bike,6.174,1000126350,18010,"2015-06-30 00:00:25",29.5913,-82.4298
1003529749,run,1.47652,1000542869,92985044,"2015-06-30 00:00:26",40.7115,-89.4287


Only the type and userid columns were actually required. Using the following command I was able to generate another CSV with just those two fields.

$ cut -f 2,4 -d, july.csv > type-userid.csv

The -f option specifies which fields to extract; in this case, it's the 2nd and 4th fields. The -d option specifies the field-delimiting character, which in our case is the comma; it defaults to the tab character.
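
To see how the -f field numbers and the -d delimiter interact, here's a minimal illustration on a made-up line (not from the real data):

  $ echo 'a,b,c,d' | cut -f 2,4 -d,
  b,d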

july.csv is the input file, and type-userid.csv captures the standard output via the redirection.

The first few rows of the resulting file were

type,userid
walk,1000545086
Run,1000402641
run,1000258381
Bike,1000505816
walk,1000015957
Bike,1000126350
run,1000542869


And the filesize was reduced to about 24MB, which was usable and much more manageable.

(However, depending on which spreadsheet app you are using, the row count might exceed the limit. For example, both Excel 2007 and LibreOffice 4.2 Calc can handle a maximum of 1,048,576 rows.)
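
A quick way to find out in advance is to count the rows with wc, for instance

  $ wc -l type-userid.csv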

  ***

The cut command works blazingly fast; it took only about a second to process the input file. After I handed off the output file, I realized that I might have been able to save my colleague some tedium and waiting time by applying other CLI text-processing commands as well, such as sort, uniq, and wc, depending on what he wanted to do.
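
For example, here's a sketch of the kind of pipeline I had in mind, assuming he wanted a count of rows per activity type. tail -n +2 drops the header line, and tr folds the mixed-case values ("Run" vs "run") to lowercase before counting.

  $ cut -f 2 -d, july.csv | tail -n +2 | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn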

Source:

The Linux Command Line, by William Shotts
http://linuxcommand.org/tlcl.php
(You can buy the paper book or download the pdf for free)