Using SUBSTR on the left of an assignment in SAS

This is a pretty cool feature, and as far as I know it’s unique to SAS and Perl. In SAS, you can have a substr to the left of an assignment.

stringDate='29MAR1984';
substr(stringDate,3,3)='FEB';

This avoids potentially awkward situations like this:

stringDate='29MAR1984';
stringDate=substr(stringDate,1,2)||'FEB'||substr(stringDate,6,4);

The bounds don’t have to be static. Lets suppose we have a dataset with 1 row per letter, and we want to combine those letters into a word.

data have;
    input letter $1.;
    datalines;
h
e
l
l
o
;
run;

My initial thought was to just retain and append each letter:

data want;
  length word $200;
  retain word;
  set have nobs=nobs;
  call cats(word,letter);
  if(nobs=_n_) then output;
run;

However this breaks if one of the letters is a space, and SAS treats the space as null, so appending a space has no effect on the string. Instead of appending, just stick the letter where it’s supposed to be.

data want;
  length word $200;
  retain word ' ';
  set have nobs=nobs;
  substr(word,_n_,1) = letter;
  if(_n_==nobs) then output;
run;

Installing SAS 9.4 on Ubuntu 14.04

Officially SAS does not support Ubuntu Linux as a platform, in spite of it being the most popular desktop distribution. It will run, but there are two things you need to do:

First, reconfigure Ubuntu not to use dash. The SAS Deployment Wizard will freeze when you first click “Next” during the install if you don’t do this.

sudo dpkg-reconfigure dash

That being done, you should be able to install SAS Foundation.

Second, SAS requires libjpeg to run. When you first run sas, you’ll get this error:

ERROR:  BRIDGE FAILURE - ERROR LOADING IMAGE 
        MODULE: sasmotifsasvsub HyS SUBSYSTEM: 8 SLOT: 11 
ERROR: Could not load /usr/local/SASHome/SASFoundation/9.4/sasexe/sasmotif (35 images loaded) 
ERROR: libjpeg.so.62: cannot open shared object file: No such file or directory 
ERROR:  BRIDGE FAILURE - ERROR LOADING IMAGE 
        MODULE: sasmotifsasvsub HyS SUBSYSTEM: 8 SLOT: 11

Install this by running this command:

sudo apt-get install libjpeg62

SAS should now run.

Additionally, while not common, but if you have SAS/Secure installed and want to pull files via https, you’ll need to point it at SSL CA certificates. You can kick off SAS with

-SSLCALISTLOC="/etc/ssl/certs/ca-certificates.crt"

Or just create a file ~/sasv9.cfg with that in it.

Update: This works on 13.10 and 14.04

SAS Macro Variable Resolution Timing

I thought this was neat and quirky. The implications of it are important, too. When you assign a macro variable in SAS, it doesn’t resolve immediately. It resolves It will also warn you if the variable doesn’t exist yet, too.

%let x = &y;
%let y = z;
%put &x;

Will output

z

Even though, when x was defined, y did not exist. We’ll get the following warning, which can be ignored.

WARNING: Apparent symbolic reference Y not resolved.

As far as I can tell (I could be wrong!), I don’t think there’s a way of turning off that warning.

Using Secure HTTPS (SSL) URLs in SAS on Linux

Out of the box SAS 9.4 is not configured to use SSL on Linux x64. If you try to do this, you’ll get the error

ERROR: SSL Error: Missing CA trust list

This happens because when SAS gets the SSL certificate from the server, it sees that it’s signed by a Certificate Authority, but it doesn’t have the SSL certificates from those authorities. Those come with OpenSSL, so get that.

sudo apt-get install openssl

With that installed, the SSL certificates should be in /etc/ssl/certs, with ca-certificates.crt being an uber-cert that contains all of the other files in one file. To tell SAS to look at that file, add this to SASHOME/SASFoundation/9.4/sasv9_local.cfg:

-SSLCALISTLOC="/etc/ssl/certs/ca-certificates.crt"

You’ll probably need sudo to modify that file, but you probably had that if you installed OpenSSL. But maybe your admin installed OpenSSL before, but never configured SAS, and is afraid to globally fix this by editing a file in the SAS directory? You can put that config option in ~/sasv9.cfg instead, creating the file if it doesn’t exist. That will only work for you, though, and other users on the system also have to do it.

NOTE: This was tested on Ubuntu Linux x64 13.10. The location of the cert file will probably differ from distro to distro and version to version. Just look for the SSL CA cert directory, and find the biggest file in it, that’s probably the uber-cert.

Handlebars #each in a Play Scala View

If you’re using Play Scala views with Handlebars in them, and you get errors that look like these:

not found: value key
In /home/phil/myproject/app/views/myview.scala.html at line 108.

or

not found: value index
In /home/phil/myproject/app/views/myview.scala.html at line 108.

It’s probably because you’re using {{@index}} when iterating through an array with {{#each myarray}} or {{@key}} when iterating through an object’s properties.

This happens because Play framework views are Scala, and the @ character is a special character. Scala thinks you’re trying to access a Scala value (and comes back with the error above). You’ll need to escape it with @@ so Scala will pass it along as just a single @ for Handlebars to see. It looks like this:

{{#each objects}}
  {{@key}}:{{this}}
{{/each}}

Open a folder window from a cmd prompt in Windows

Easy way to open up a folder window while in a command line prompt. This opens a window to wherever you are, in case.

C:>mkdir tempdir

C:>cd tempdir

C:tempdir>explorer .

C:tempdir>

Using a Backbone Model with Handlebars

A Backbone model’s property has to be accessed with the get(property) method.

var myObj = Backbone.Model.extend({ ... });
//don't do this
console.log(myObj.myProp);

//do this instead
console.log(myObj.get('myProp'));

If you’re trying to pass this object to a Handlebars.js, things start to get messy. If you’ve got a template like this

var myTemplate = Handlebars.compile("Property: {{ myObj.myProp }}");
console.log(myTemplate(myObj));

Handlebars won’t find the property, and nothing will be displayed. There’s a well-known workaround to this though, which is to convert the Backbone Model into JSON

console.log(myTemplate(myObj.toJSON()));

However this isn’t as readable as I’d prefer. If your collection gets really big, it can make your application slow, and in situations where you’ve got a Backbone Collection of Models, it doesn’t even work. A. Matias Quezada has a better solution, which is to tell Handlebars, “hey, if it’s a Model, use the get method”:

Handlebars.JavaScriptCompiler.prototype.nameLookup = function(parent, name, type) {
	var result = '(' + parent + ' instanceof Backbone.Model ? ' + parent + '.get("' + name + '") : ' + parent;
	if (/^[0-9]+$/.test(name)) {
		return result + "[" + name + "])";
	} else if (Handlebars.JavaScriptCompiler.isValidJavaScriptVariableName(name)) {
		return result + "." + name + ')';
	} else {
		return result + "['" + name + "'])";
	}
};

That’s it. If you run that somewhere before your Template runs, Handlebars will be able to find the Backbone Model’s property.

In writing this post, I also came across Thorax.js, which probably also fixes this problem.

Mining Bitcoin in the Eligius pool using Phoenix2

I prefer to use the Eligius bitcoin mining pool because the pool takes no cut (compared to the larger, more popular BTC Guild, which takes 3-5%). The “getting started” for the Eligius Bitcoin mining pool doesn’t show how to configure Phoenix2, which is faster than, and supports more features over the older Phoenix 1.xx miner. Here’s a quick/dirty tutorial of how I set them up to work together on my machine.

NOTE:┬áIt’s cheaper to buy bitcoin on an exchange than to mine it. The cost of a dedicated machine will probably take 1-2 years to recoup.

Step 1: Download Phoenix 2

Extract it to a folder on your hard drive, like c:phoenix. Anywhere. Doesn’t really matter.

Step 2: Create phoenix.cfg

Create it in that same folder. It should contain something like this. Once running you can play around with some of the parameters.

[general]
    autodetect = +cl -cpu
    verbose = False
    backend = http://1PhiLHuzbozkqVkWYZEUiptKNGYNfne9Hb:x@mining.eligius.st:8337
    backups = http://rpcuser:rpcpassword@localhost:8332/
    failback = 600
    queuesize = 1
    ratesamples = 100
    logfile = False
[cl:0:0]
    name = GPU 0
    kernel = phatk2
    aggression = 1
    bfi_int = True
    fastloop = False
    goffset = True
    vectors = True
    vectors4 = False
    worksize = 128

Once running, you can tinker around with these last parameters to get the highest Mhash/s rate. You will want to change the username to be the backend property to be where you want your rewards to be sent when the pool finds a new block (ideally yourself). You will also need some luck for the pool to start finding Bitcoin. You can generate a higher chance of that happening by donating bitcoin to 1PhiLHuzbozkqVkWYZEUiptKNGYNfne9Hb

Step 3: There is no step 3. Just run phoenix.exe from that folder. Go find something to do while your computer makes you money.

Automatic Base SAS Library Assignments

If you stick a file named autoexec.sas in the directory where SAS is installed, it will run automatically when SAS starts up. By default, this place is C:Program FilesSASHomeSASFoundation9.3. This has worked since at least SAS 8.2, probably before.

I like to use my Windows desktop as a temporary staging area, so I have my autoexec.sas automatically assign a “DESKTOP” library with this code:

%let USERPROFILE=%sysget(USERPROFILE);
libname desktop "&USERPROFILEDesktop";

So when SAS starts up, the library is assigned and my log says this:

NOTE: Copyright (c) 2002-2010 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software 9.3 (TS1M2)
      Licensed to Microsoft Windows for x64 All Compatible Non-Plann, Site  XXXXXXX.
NOTE: This session is executing on the X64_7PRO  platform.



NOTE: Enhanced analytical products:

SAS/STAT 12.1, SAS/ETS 12.1, SAS/OR 12.2, SAS/IML 12.1, SAS/QC 12.1

NOTE: SAS initialization used:
      real time           0.65 seconds
      cpu time            0.63 seconds


NOTE: AUTOEXEC processing beginning; file is C:Program
      FilesSASHomeSASFoundation9.3autoexec.sas.

NOTE: Libref DESKTOP was successfully assigned as follows:
      Engine:        V9
      Physical Name: C:UsersphilihpDesktop

NOTE: AUTOEXEC processing completed.

Finding the Center of US Counties in SAS

I had a problem where I needed the center longitude and latitude of each US county. SAS comes with some datasets containing map data in the format of tables that trace a line around the borders of counties. I was interested in the MAPSGFK.US_COUNTIES dataset, which has data that looks like this:

The State and County numbers were FIPS codes, which were also stored in the US_COUNTIES_ATTR table, so those are easy enough to figure out. Segment seemed to be always 1. Resolution and Density were numbers that I figured had to do with how much detail the drawing would have if these points were used in a line drawing.

The longitude and latitude here are all points along the border. The simplest, and most incorrect way of getting the center of the county would be to take an average of them, which would be fine for rectangular counties, but for territories where one side is flat (and has few waypoints) and one side is jagged (such as a coastal border), the midpoint will be weighted to that side. So instead, use geometry.

In geometry, there’s a formula for the Centroid of a polygon. This is a magic formula that I just trust is correct, because everything on Wikipedia is true, without exception. Here was my first pass at an implementation of it in SAS:

proc sort data=mapsgfk.us_counties out=centroids;
  by state county;
run;
data centroids(keep=cx cy county state);
  retain yi yj xi xj a cx cy x0 y0 0;
  set centroids(keep=state county lat long rename=(lat=yj long=xj));
  by state county;
  if(first.county) then do;
    cx = 0;
    cy = 0;
    a = 0;
    x0 = xj;
    y0 = yj;
  end;
  else if(not first.county) then do;
    ta = (xi*yj - xj*yi);
    cx + ((xi+xj)*ta);
    cy + ((yi+yj)*ta);
    a + ta;
  end;
  if(last.county) then do;
    ta = (xj*y0 - x0*yj);
    cx + ((xj+x0)*ta);
    cy + ((yj+y0)*ta);
    a  = ta + a * 0.5;
    cx = cx / (6*a);
    cy = cy / (6*a);
    output;
  end;
  xi = xj;
  yi = yj;
run;

Here, xi and yi are always the coordinates from the previous point, and “if not first.county” prevents processing of the first point of a county because it didn’t have a previous point. The variables a, cx, and cy accumulate for every point. The variable ta is the area of the rectangle defined by the two points. Once all the points have been accumulated, the area is halfed so it’s the actual area, and the area is used in calculating the centroid (cx and cy).

But then I noticed the 2nd county was positioned somewhere in the mid-Atlantic. Something had to be up. Looking closer, this was Baldwin County, which had an island. Looking at the source data, this segment was drawn in two segments, which caused the centroid formula to choke, as it assumed contiguous shapes. So instead, in my second pass, I compute the centroid of every segment, then average them weighted by their areas.

proc sort data=mapsgfk.us_counties out=centroids;
  by state county segment;
run;
data centroids_temp;
  retain yi yj xi xj a cx cy x0 y0 0;
  set centroids(keep=state county segment lat long rename=(lat=yj long=xj));
  by state county segment;
  if(first.segment) then do;
    cx = 0;
    cy = 0;
    a = 0;
    x0 = xj;
    y0 = yj;
  end;
  else if(not first.segment) then do;
    ta = (xi*yj - xj*yi);
    cx + ((xi+xj)*ta);
    cy + ((yi+yj)*ta);
    a + ta;
  end;
  if(last.segment) then do;
    ta = (xj*y0 - x0*yj);
    cx + ((xj+x0)*ta);
    cy + ((yj+y0)*ta);
    a  = ta + a * 0.5;
    cx = cx / (6*a);
    cy = cy / (6*a);
    output;
  end;
  xi = xj;
  yi = yj;
run;
proc sql;
  create table centroid_weight as
  select
    state, county, sum(a) as sum
  from centroids_temp
    group by state, county;
quit;
proc sql;
  create table centroids as
  select a.state, a.county,
         sum(cx*(a / sum)) as lat,
         sum(cy*(a / sum)) as long
    from centroids_temp a
    inner join centroid_weight b
      on (a.state = b.state and a.county = b.county)
    group by a.state, a.county;
quit;
proc sql;
  drop table centroids_temp;
  drop table centroid_weight;
quit;

Here, I add two more steps. The first proc SQL block sums up the total area of each county, which is used in the second block to find the average centroid of each county’s centroids, weighted by the total area that centroid represents, which gives us correct centroids for counties drawn in multiple segments.