<b>Arbinger Systems</b><br />
<br />
<b>Integrate your Perl application with Google Apps Marketplace</b><br />
<i>Posted 2010-12-21, updated 2010-12-22</i><br />
<br />
<img src="http://www.arbingersys.com/images/googapp-controlpanel-b48.png" alt="" align="left" style="margin-right:7px" />I spent most of the last week trying to figure out how to take a Perl web app and integrate it with the <a href="http://www.google.com/enterprise/marketplace/home">Google Apps Marketplace</a>. This is where the supposedly 3 million businesses who signed up for <a href="http://www.google.com/apps/intl/en/business/index.html">Google Apps</a> go for third-party integrations.<br />
<br />
You have to sign up as a <a href="http://developer.googleapps.com/marketplace/getting-started">vendor</a> in order to make your web application available to Google Apps customers. The other requirement is that <b>your app supports OpenID Single Sign-on</b>. This is where the integration turned difficult for me.<br />
<br />
I assumed you would use <a href="http://search.cpan.org/~mart/Net-OpenID-Consumer-1.03/lib/Net/OpenID/Consumer.pm">Net::OpenID::Consumer</a> to handle the consumer-side processing. However, after only a little headway, and asking around on <a href="http://stackoverflow.com/questions/4443113/openid-authentication-to-google-apps-via-perl-and-netopenidconsumer-fails">StackOverflow</a> as well as the Google Marketplace forums, I was stuck. I could not close the OpenID circuit and continue on to my app.<br />
<br />
I eventually <b>solved the problem by switching modules</b>. I changed to the skimpily documented <a href="http://search.cpan.org/~cebjyre/Net-Google-FederatedLogin-0.5.3/">Net::Google::FederatedLogin</a>, and finally got things working. <br />
<br />
The code is as follows (substitute <em>example.com</em> below for your actual developer's domain).<br />
<br />
First, you have to log in to your Google Apps Marketplace vendor profile and add the URL of index.cgi to your <b>application manifest</b>, with the required <code>${DOMAIN_NAME}</code> variable. <code>${DOMAIN_NAME}</code> will be replaced by the domain of the user who installs your app. This parameter is integral to the authentication scheme.<br />
<pre class="code" style="font-size:85%">...
<Url>http://www.example.com/index.cgi?from=google&domain=${DOMAIN_NAME}</Url>
...
</pre>The application manifest is like the installer for your web app. It's detailed <a href="http://code.google.com/googleapps/marketplace/manifest.html">here</a>, but is kind of outside of the scope of this post. <br />
<br />
Once you've gotten the application manifest done, add the following code to your server.<br />
<br />
<b>index.cgi</b><br />
<pre class="code" style="font-size:85%">use strict;
use warnings;
use CGI;
use Net::Google::FederatedLogin;

my $q = CGI->new();

# Google fills in the installing customer's domain via ${DOMAIN_NAME}
my $domain = $q->param('domain');
if (!$domain) {
    print $q->header(), 'Provide domain please.';
    exit 0;
}

my $fl = Net::Google::FederatedLogin->new(
    claimed_id =>
        'https://www.google.com/accounts/o8/site-xrds?hd=' . $domain,
    return_to =>
        'http://www.example.com/return.cgi',
    extensions => [
        {
            ns         => 'ax',
            uri        => 'http://openid.net/srv/ax/1.0',
            attributes => {
                mode     => 'fetch_request',
                required => 'email',
                type     => {
                    email => 'http://axschema.org/contact/email'
                }
            }
        }
    ] );

# Send the user to Google to authenticate
print $q->redirect($fl->get_auth_url());
</pre>Note that <code>$domain</code> above is used in the <code>claimed_id</code> parameter and is sent to Google for verification. The <code>extensions</code> parameter tells Google what user data to send back to your site when it redirects to <code>return_to</code>, which in this case is<br />
<br />
<b>return.cgi</b><br />
<pre class="code" style="font-size:85%">use strict;
use warnings;
use CGI;
use Net::Google::FederatedLogin;
use LWP::UserAgent;
use HTTP::Request::Common;
use URI;
use URI::Escape qw(uri_escape);
use Net::OAuth;

# OAuth (to access user's Google data)
# You get these from your vendor profile in Google Apps. Same place
# where you edit the application manifest.
my $CONSUMER_KEY    = '??????????????.apps.googleusercontent.com';
my $CONSUMER_SECRET = '??????????????????';

# We want to get some calendar data from the user
my $URL =
    'https://www.google.com/calendar/feeds/default/allcalendars/full';

my $q = CGI->new();
print $q->header();

# OpenID final step
my $fl = Net::Google::FederatedLogin->new(
    cgi       => $q,
    return_to =>
        'http://www.example.com/return.cgi' );

eval { $fl->verify_auth(); };
if ($@) {
    print 'Error: ' . $@;
}
else {
    # Pull the email address out of the attribute exchange extension
    my $ext = $fl->get_extension('http://openid.net/srv/ax/1.0');
    get_calendar_oauth($ext->get_parameter('value.email'));
}

# OAuth
sub get_calendar_oauth {
    my $email = shift;

    my $oauth_request =
        Net::OAuth->request('consumer')->new(
            consumer_key     => $CONSUMER_KEY,
            consumer_secret  => $CONSUMER_SECRET,
            request_url      => $URL,
            request_method   => 'GET',
            signature_method => 'HMAC-SHA1',
            timestamp        => time,
            nonce            => nonce(),
            extra_params     => {
                'xoauth_requestor_id' => $email
            },
        );
    $oauth_request->sign();

    my $req = HTTP::Request->new(
        GET => $URL . '?xoauth_requestor_id=' . uri_escape($email) );
    $req->header('Content-type' => 'application/atom+xml');
    $req->header(
        'Authorization' => $oauth_request->to_authorization_header);

    my $ua = LWP::UserAgent->new;
    my $oauth_response = $ua->simple_request($req);

    # Follow redirects by hand, re-signing the request for each new URL
    while ($oauth_response->is_redirect) {
        my $url = URI->new($oauth_response->header('Location'));
        $req->uri($url);
        my %query = $url->query_form;
        foreach my $param (keys %query) {
            $oauth_request->{extra_params}->{$param} = $query{$param};
        }
        $url->query(undef); # clear out the query parameters
        $oauth_request->{request_url} = $url;
        $oauth_request->sign; # re-sign
        $req->header(
            'Authorization' => $oauth_request->to_authorization_header );
        $oauth_response = $ua->simple_request($req);
    }

    print $oauth_response->as_string;
} # get_calendar_oauth

# Build a random 32-character alphanumeric nonce
sub nonce {
    my @a = ('A'..'Z', 'a'..'z', 0..9);
    my $nonce = '';
    for (0..31) {
        $nonce .= $a[rand(scalar(@a))];
    }
    $nonce;
}
</pre>The final OpenID step is quite minimal, as you can see above. You simply create a new Net::Google::FederatedLogin object and pass it the CGI object plus <code>return_to</code> value. Then you verify, and if there isn't an error, you should be able to access the extension data via the call to <code>get_extension()</code>.<br />
<br />
Much of the above script is devoted to doing <b>OAuth</b> in order to access the user's Google data, in this case his calendar. If you only need to authenticate a user and <em>not</em> access Google data, you could omit the call to <code>get_calendar_oauth()</code> entirely.<br />
<br />
<b>OAuth</b><br />
<br />
When you create your app in the vendor section of Google Apps Marketplace, it will be assigned a <b>Consumer Secret</b> and a <b>Consumer Key</b>. These must be present in the parameters when you instantiate your Net::OAuth object. In the above code, you would set <code>$CONSUMER_KEY</code> and <code>$CONSUMER_SECRET</code> to these values.<br />
<br />
The data is returned as Atom/XML. In the above code I do nothing with it except print it out. The code in <code>get_calendar_oauth</code> has been borrowed almost directly from this <a href="http://blog.case.edu/jeremy.smith/2009/03/30/using_2legged_oauth_with_google_apps_in_perl">blog post</a> by Jeremy Smith.<br />
<br />
That's basically it. This was intended to be a sparse example covering the two main points for integrating with Google Apps from Perl -- <b>OpenID</b> to grant access to your app via Google credentials, and <b>OAuth</b> for accessing Google data on behalf of the user.<br />
<br />
<b>Bookmarklets versus Man-In-The-Middle attacks</b><br />
<i>Posted 2010-11-30</i><br />
<br />
<img align="left" alt="" src="http://www.arbingersys.com/images/web-secure.png" style="margin-right: 5px;" />Let me start off by saying I don't consider myself a security expert. As a web systems developer I've had to become knowledgeable about security, e.g. client-side password hashing, salted hashes, PKI, etc. But like many I've relied quite a bit on TLS/SSL to ensure that data moving between my systems and users is safe. In fact, if I were completely honest, I'd say it's been something of a crutch. If we point users to an <b>https</b> link, we feel like we've done what's necessary for security.<br />
<br />
TLS/SSL has a pretty serious weakness, however: the <a href="http://en.wikipedia.org/wiki/Man-in-the-middle_attack">Man-In-The-Middle</a> attack. And on a local network MITM is a fairly trivial thing to do, thanks to the <a href="http://en.wikipedia.org/wiki/Address_Resolution_Protocol">Address Resolution Protocol</a>, which is used nearly everywhere for one physical device to find another on a network.<br />
<br />
MITM is also trivial to do because smart and devious people like <a href="http://www.thoughtcrime.org/">Moxie Marlinspike</a> have exploited these weaknesses and created tools like <a href="http://www.thoughtcrime.org/software/sslstrip/">sslstrip</a>. With available tools and only a reasonable amount of knowledge, a "script kiddie" <a href="http://www.youtube.com/watch?v=Dd5qGS-5C0I">can implement MITM in around 2 minutes</a>, and pretty easily trick you into giving away information. Think about that the next time you want to do your banking while sitting in a coffee shop.<br />
<br />
The MITM attack is very difficult to circumvent programmatically because, as the video linked above shows, the attacker has the page first and is able to manipulate it in subtle ways that are hard to detect. For instance, <b>stripping out https links</b> so that when you log in somewhere, your credentials are sent in the clear for the attacker to view and capture.<br />
<br />
<h2 style="font-size: 116%;">Bookmarklets</h2><br />
Recently, I began to think about safe ways to do logins, <b>assuming that a MITM attack was under way</b>.<br />
<br />
You could use public/private key encryption like <a href="http://www-cs-students.stanford.edu/~tjw/jsbn/">RSA</a> to encrypt the username and password with a public key before sending. However, if someone is in the middle, they could just as well manipulate the code to use a public key of their own, and then decrypt your credentials. It makes the attack harder, but not impossible.<br />
<br />
So how can you ensure that the key (and code) you've obtained from the server hasn't been tampered with? This is where I thought of using bookmarklets. <a href="http://en.wikipedia.org/wiki/Bookmarklet">Bookmarklets</a> are typically a bit of compressed JavaScript code that is stored in a link. When clicked, the JavaScript runs in the context of the current page. My idea was to do the following:<br />
<br />
1. Create a login page that uses public key encryption to encrypt credentials before sending. Embed the public key in the page.<br />
<br />
2. Use a hash function to generate a signature for the login page.<br />
<br />
3. Embed the hash function along with the hash from Step 2 in a bookmarklet that you make available to users. Ask them to add it to their Bookmarks.<br />
<br />
4. When users visit your login page, they would click the bookmarklet <b>from their Bookmarks</b>. It would process the current page and generate a hash, and compare it to the one in the bookmarklet. If the hashes didn't match, the user would be alerted.<br />
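<br />
The four steps above can be sketched roughly as follows. This is a minimal illustration written for Node.js rather than the browser: the helper names (<code>normalize</code>, <code>serializeTree</code>, <code>sha1Hex</code>, <code>validatePage</code>) and the plain-object stand-in for DOM nodes are my own assumptions, and Node's built-in <code>crypto</code> module stands in for the hand-rolled SHA1 that a real bookmarklet has to carry around with it.<br />
<br />

```javascript
const crypto = require('crypto');

// Strip whitespace and quotes so trivial formatting differences
// don't change the hash.
function normalize(value) {
  return String(value).replace(/\s+/g, '').replace(/['"]/g, '');
}

// Walk a DOM-like tree (objects with nodeName, nodeValue, attributes,
// childNodes) and flatten text, element names and attributes into one string.
function serializeTree(nodes) {
  let out = '';
  for (const node of nodes) {
    if (node.nodeName === '#text') {
      out += normalize(node.nodeValue);
    } else {
      out += node.nodeName;
      for (const attr of node.attributes || []) {
        out += attr.name + normalize(attr.value);
      }
    }
    out += serializeTree(node.childNodes || []);
  }
  return out;
}

function sha1Hex(text) {
  return crypto.createHash('sha1').update(text, 'utf8').digest('hex');
}

// Step 4: compare the hash of the page as received with the trusted hash.
function validatePage(nodes, expectedHash) {
  return sha1Hex(serializeTree(nodes)) === expectedHash;
}

// Hypothetical login page, reduced to a single form element.
const loginPage = [{
  nodeName: 'FORM',
  attributes: [{ name: 'action', value: 'https://www.example.com/login' }],
  childNodes: [{ nodeName: '#text', nodeValue: ' Log in ' }],
}];

// Step 2: the developer computes this once and embeds it in the bookmarklet.
const trustedHash = sha1Hex(serializeTree(loginPage));

// What an attacker stripping https from the form target might produce.
const tamperedPage = JSON.parse(JSON.stringify(loginPage));
tamperedPage[0].attributes[0].value = 'http://www.example.com/login';

console.log(validatePage(loginPage, trustedHash));    // true
console.log(validatePage(tamperedPage, trustedHash)); // false
```

The important property is that the trusted hash lives in the user's Bookmarks, not in the page, so someone in the middle can alter the page but not the value it is checked against.<br />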
<br />
<h2 style="font-size: 116%;">Proof-of-concept</h2><br />
I ended up doing the following to test my theory. First, I created a <b>bookmarklet for developers</b>. When clicked, it traverses the page you're currently visiting and extracts text, elements, and attributes, generating a SHA1 hash from the combined values. It then outputs the JavaScript code along with the hash, which you can turn into a bookmarklet for <b>your users</b>.<br />
<br />
To use it, drag the below link to your Bookmarks, visit your [login] page, and click the link. The code for the bookmarklet of your page will pop up in a new window. Use that code to make a bookmarklet to add to your site.<br />
<br />
<img align="absmiddle" alt="" src="http://www.arbingersys.com/t/bkm/bookmarklet.jpeg" /><a href="javascript:(function(){function%20SHA1(msg){function%20rotate_left(n,s){var%20t4=(n%3C%3Cs)|(n%3E%3E%3E(32-s));return%20t4;};function%20lsb_hex(val){var%20str=%22%22;var%20i;var%20vh;var%20vl;for(i=0;i%3C=6;i+=2){vh=(val%3E%3E%3E(i*4+4))%260x0f;vl=(val%3E%3E%3E(i*4))%260x0f;str+=vh.toString(16)+vl.toString(16);}return%20str;};function%20cvt_hex(val){var%20str=%22%22;var%20i;var%20v;for(i=7;i%3E=0;i--){v=(val%3E%3E%3E(i*4))%260x0f;str+=v.toString(16);}return%20str;};function%20Utf8Encode(string){string=string.replace(/\r\n/g,%22\n%22);var%20utftext=%22%22;for(var%20n=0;n%20%3C%20string.length;n++){var%20c=string.charCodeAt(n);if(c%20%3C%20128){utftext+=String.fromCharCode(c);}else%20if((c%20%3E%20127)%26%26(c%20%3C%202048)){utftext+=String.fromCharCode((c%20%3E%3E%206)|%20192);utftext+=String.fromCharCode((c%20%26%2063)|%20128);}else{utftext+=String.fromCharCode((c%20%3E%3E%2012)|%20224);utftext+=String.fromCharCode(((c%20%3E%3E%206)%26%2063)|%20128);utftext+=String.fromCharCode((c%20%26%2063)|%20128);}}return%20utftext;};var%20blockstart;var%20i,j;var%20W=new%20Array(80);var%20H0=0x67452301;var%20H1=0xEFCDAB89;var%20H2=0x98BADCFE;var%20H3=0x10325476;var%20H4=0xC3D2E1F0;var%20A,B,C,D,E;var%20temp;msg=Utf8Encode(msg);var%20msg_len=msg.length;var%20word_array=new%20Array();for(i=0;i%3Cmsg_len-3;i+=4){j=msg.charCodeAt(i)%3C%3C24%20|%20msg.charCodeAt(i+1)%3C%3C16%20|msg.charCodeAt(i+2)%3C%3C8%20|%20msg.charCodeAt(i+3);word_array.push(j);}switch(msg_len%254){case%200:i=0x080000000;break;case%201:i=msg.charCodeAt(msg_len-1)%3C%3C24%20|%200x0800000;break;case%202:i=msg.charCodeAt(msg_len-2)%3C%3C24%20|%20msg.charCodeAt(msg_len-1)%3C%3C16%20|%200x08000;break;case%203:i=msg.charCodeAt(msg_len-3)%3C%3C24%20|%20msg.charCodeAt(msg_len-2)%3C%3C16%20|%20msg.charCodeAt(msg_len-1)%3C%3C8%20|%200x80;break;}word_array.push(i);while((word_array.length%2516)!=14)word_array.push(0);word_array
.push(msg_len%3E%3E%3E29);word_array.push((msg_len%3C%3C3)%260x0ffffffff);for(blockstart=0;blockstart%3Cword_array.length;blockstart+=16){for(i=0;i%3C16;i++)W[i]=word_array[blockstart+i];for(i=16;i%3C=79;i++)W[i]=rotate_left(W[i-3]^%20W[i-8]^%20W[i-14]^%20W[i-16],1);A=H0;B=H1;C=H2;D=H3;E=H4;for(i=0;i%3C=19;i++){temp=(rotate_left(A,5)+((B%26C)|(~B%26D))+E+W[i]+0x5A827999)%26%200x0ffffffff;E=D;D=C;C=rotate_left(B,30);B=A;A=temp;}for(i=20;i%3C=39;i++){temp=(rotate_left(A,5)+(B%20^%20C%20^%20D)+E+W[i]+0x6ED9EBA1)%26%200x0ffffffff;E=D;D=C;C=rotate_left(B,30);B=A;A=temp;}for(i=40;i%3C=59;i++){temp=(rotate_left(A,5)+((B%26C)|(B%26D)|(C%26D))+E+W[i]+0x8F1BBCDC)%26%200x0ffffffff;E=D;D=C;C=rotate_left(B,30);B=A;A=temp;}for(i=60;i%3C=79;i++){temp=(rotate_left(A,5)+(B%20^%20C%20^%20D)+E+W[i]+0xCA62C1D6)%26%200x0ffffffff;E=D;D=C;C=rotate_left(B,30);B=A;A=temp;}H0=(H0+A)%26%200x0ffffffff;H1=(H1+B)%26%200x0ffffffff;H2=(H2+C)%26%200x0ffffffff;H3=(H3+D)%26%200x0ffffffff;H4=(H4+E)%26%200x0ffffffff;}var%20temp=cvt_hex(H0)+cvt_hex(H1)+cvt_hex(H2)+cvt_hex(H3)+cvt_hex(H4);return%20temp.toLowerCase();}var%20BKM=%22function%20SHA1(msg){function%20rotate_left(n,s){var%20t4=(n%3C%3Cs)|(n%3E%3E%3E(32-s));return%20t4};function%20lsb_hex(val){var%20str=\%22\%22;var%20i;var%20vh;var%20vl;for(i=0;i%3C=6;i+=2){vh=(val%3E%3E%3E(i*4+4))%260x0f;vl=(val%3E%3E%3E(i*4))%260x0f;str+=vh.toString(16)+vl.toString(16)}return%20str};function%20cvt_hex(val){var%20str=\%22\%22;var%20i;var%20v;for(i=7;i%3E=0;i--){v=(val%3E%3E%3E(i*4))%260x0f;str+=v.toString(16)}return%20str};function%20Utf8Encode(string){string=string.replace(/\\r\\n/g,\%22\\n\%22);var%20utftext=\%22\%22;for(var%20n=0;n%3Cstring.length;n++){var%20c=string.charCodeAt(n);if(c%3C128){utftext+=String.fromCharCode(c)}else%20if((c%3E127)%26%26(c%3C2048)){utftext+=String.fromCharCode((c%3E%3E6)|192);utftext+=String.fromCharCode((c%2663)|128)}else{utftext+=String.fromCharCode((c%3E%3E12)|224);utftext+=String.fromCharCode(((c%3E%3E6)%2663)|128);utftext+=
String.fromCharCode((c%2663)|128)}}return%20utftext};var%20blockstart;var%20i,j;var%20W=new%20Array(80);var%20H0=0x67452301;var%20H1=0xEFCDAB89;var%20H2=0x98BADCFE;var%20H3=0x10325476;var%20H4=0xC3D2E1F0;var%20A,B,C,D,E;var%20temp;msg=Utf8Encode(msg);var%20msg_len=msg.length;var%20word_array=new%20Array();for(i=0;i%3Cmsg_len-3;i+=4){j=msg.charCodeAt(i)%3C%3C24|msg.charCodeAt(i+1)%3C%3C16|msg.charCodeAt(i+2)%3C%3C8|msg.charCodeAt(i+3);word_array.push(j)}switch(msg_len%254){case%200:i=0x080000000;break;case%201:i=msg.charCodeAt(msg_len-1)%3C%3C24|0x0800000;break;case%202:i=msg.charCodeAt(msg_len-2)%3C%3C24|msg.charCodeAt(msg_len-1)%3C%3C16|0x08000;break;case%203:i=msg.charCodeAt(msg_len-3)%3C%3C24|msg.charCodeAt(msg_len-2)%3C%3C16|msg.charCodeAt(msg_len-1)%3C%3C8|0x80;break}word_array.push(i);while((word_array.length%2516)!=14)word_array.push(0);word_array.push(msg_len%3E%3E%3E29);word_array.push((msg_len%3C%3C3)%260x0ffffffff);for(blockstart=0;blockstart%3Cword_array.length;blockstart+=16){for(i=0;i%3C16;i++)W[i]=word_array[blockstart+i];for(i=16;i%3C=79;i++)W[i]=rotate_left(W[i-3]^W[i-8]^W[i-14]^W[i-16],1);A=H0;B=H1;C=H2;D=H3;E=H4;for(i=0;i%3C=19;i++){temp=(rotate_left(A,5)+((B%26C)|(~B%26D))+E+W[i]+0x5A827999)%260x0ffffffff;E=D;D=C;C=rotate_left(B,30);B=A;A=temp}for(i=20;i%3C=39;i++){temp=(rotate_left(A,5)+(B^C^D)+E+W[i]+0x6ED9EBA1)%260x0ffffffff;E=D;D=C;C=rotate_left(B,30);B=A;A=temp}for(i=40;i%3C=59;i++){temp=(rotate_left(A,5)+((B%26C)|(B%26D)|(C%26D))+E+W[i]+0x8F1BBCDC)%260x0ffffffff;E=D;D=C;C=rotate_left(B,30);B=A;A=temp}for(i=60;i%3C=79;i++){temp=(rotate_left(A,5)+(B^C^D)+E+W[i]+0xCA62C1D6)%260x0ffffffff;E=D;D=C;C=rotate_left(B,30);B=A;A=temp}H0=(H0+A)%260x0ffffffff;H1=(H1+B)%260x0ffffffff;H2=(H2+C)%260x0ffffffff;H3=(H3+D)%260x0ffffffff;H4=(H4+E)%260x0ffffffff}var%20temp=cvt_hex(H0)+cvt_hex(H1)+cvt_hex(H2)+cvt_hex(H3)+cvt_hex(H4);return%20temp.toLowerCase()}var%20a='';function%20tv(nodes){for(var%20i=0;i%3Cnodes.length;i++){if(nodes[i].nodeName=='%23text'){a+=
nodes[i].nodeValue.replace(/\\s*/g,'').replace(/'|\%22/g,'')}else{a+=nodes[i].nodeName;if(nodes[i].attributes!=null){var%20attrs=nodes[i].attributes;for(var%20j=0;j%3Cattrs.length;j++){a+=attrs[j].nodeName+attrs[j].nodeValue.replace(/\\s*/g,'').replace(/'|\%22/g,'')}}}tv(nodes[i].childNodes)}}tv(window.document.childNodes);var%20b='__CMPSTR__';if(SHA1(a)==b){alert('Page%20validated.')}else{alert('Page%20NOT%20validated.')}%22;var%20a='';function%20tv(nodes){for(var%20i=0;i%20%3C%20nodes.length;i++){if(nodes[i].nodeName=='%23text'){a+=nodes[i].nodeValue.replace(/\s*/g,'').replace(/'|%22/g,'');}else{a+=nodes[i].nodeName;if(nodes[i].attributes%20!=null){var%20attrs=nodes[i].attributes;for(var%20j=0;j%20%3C%20attrs.length;j++){a+=attrs[j].nodeName+attrs[j].nodeValue.replace(/\s*/g,'').replace(/'|%22/g,'');}}}tv(nodes[i].childNodes);}}tv(window.document.childNodes);w=window.open();w.document.write('Convert%20the%20following%20code%20to%20a%20bookmarklet%20and%20paste%20into%20your%20markup.'+'%20%3Ca%20href=%22%23%22%20onclick=%22window.open('%20+%22'http://chris.zarate.org/projects/bookmarkleter/',%22+%22'bkm','height=480,width=800,resizable=1,scrollbars=1');%22+';return%20false;%22%3EBookmarklet%20converter%3C/a%3E'%20+'%3Ctextarea%20style=%22width:100%25;height:100%25%22%3E'+BKM.replace('__CMPSTR__',SHA1(a))+'%3C/textarea%3E');})();">Generate SHA1 Validation Bookmarklet</a> <br />
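<br />
Stripped of the SHA1 implementation, the generator above boils down to a code template with a placeholder for the hash. Here is a rough sketch of that step alone. Only the <code>__CMPSTR__</code> placeholder and the alert messages are taken from the actual bookmarklet; the function name <code>buildBookmarklet</code>, the shortened template and the <code>pageHash()</code> call inside it are stand-ins I made up for illustration.<br />
<br />

```javascript
// Template for the end-user bookmarklet. '__CMPSTR__' marks where the
// page hash goes; pageHash() is a placeholder for the SHA1-over-DOM code.
const TEMPLATE =
  "var b='__CMPSTR__';" +
  "if(pageHash()==b){alert('Page validated.')}" +
  "else{alert('Page NOT validated.')}";

function buildBookmarklet(template, hash) {
  if (!/^[0-9a-f]{40}$/.test(hash)) {
    throw new Error('expected a 40-character hex SHA1 digest');
  }
  const body = template.replace('__CMPSTR__', hash);
  // Wrap in an immediately invoked function and percent-encode the result
  // so it can be used as the href of a link.
  return 'javascript:' + encodeURIComponent('(function(){' + body + '})();');
}

const href = buildBookmarklet(
  TEMPLATE, '3b7bb0cf6ca0a3c9d7c38e92c25511a8314c3cd9');
console.log(href.slice(0, 11)); // javascript:
```

The resulting string is what goes into the <code>href</code> of the link you ask users to drag to their Bookmarks.<br />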
<br />
Using the above validation-bookmarklet-generator-bookmarklet :), I've created the following example of a pretend login page that uses RSA to encrypt credentials before sending.<br />
<br />
Here's the <b>validation bookmarklet</b> for my pretend login page:<br />
<br />
<img align="absmiddle" alt="" src="http://www.arbingersys.com/t/bkm/bookmarklet.jpeg" /><a href="javascript:(function(){function%20SHA1(msg){function%20rotate_left(n,s){var%20t4=(n%3C%3Cs)|(n%3E%3E%3E(32-s));return%20t4};function%20lsb_hex(val){var%20str=%22%22;var%20i;var%20vh;var%20vl;for(i=0;i%3C=6;i+=2){vh=(val%3E%3E%3E(i*4+4))%260x0f;vl=(val%3E%3E%3E(i*4))%260x0f;str+=vh.toString(16)+vl.toString(16)}return%20str};function%20cvt_hex(val){var%20str=%22%22;var%20i;var%20v;for(i=7;i%3E=0;i--){v=(val%3E%3E%3E(i*4))%260x0f;str+=v.toString(16)}return%20str};function%20Utf8Encode(string){string=string.replace(/\r\n/g,%22\n%22);var%20utftext=%22%22;for(var%20n=0;n%3Cstring.length;n++){var%20c=string.charCodeAt(n);if(c%3C128){utftext+=String.fromCharCode(c)}else%20if((c%3E127)%26%26(c%3C2048)){utftext+=String.fromCharCode((c%3E%3E6)|192);utftext+=String.fromCharCode((c%2663)|128)}else{utftext+=String.fromCharCode((c%3E%3E12)|224);utftext+=String.fromCharCode(((c%3E%3E6)%2663)|128);utftext+=String.fromCharCode((c%2663)|128)}}return%20utftext};var%20blockstart;var%20i,j;var%20W=new%20Array(80);var%20H0=0x67452301;var%20H1=0xEFCDAB89;var%20H2=0x98BADCFE;var%20H3=0x10325476;var%20H4=0xC3D2E1F0;var%20A,B,C,D,E;var%20temp;msg=Utf8Encode(msg);var%20msg_len=msg.length;var%20word_array=new%20Array();for(i=0;i%3Cmsg_len-3;i+=4){j=msg.charCodeAt(i)%3C%3C24|msg.charCodeAt(i+1)%3C%3C16|msg.charCodeAt(i+2)%3C%3C8|msg.charCodeAt(i+3);word_array.push(j)}switch(msg_len%254){case%200:i=0x080000000;break;case%201:i=msg.charCodeAt(msg_len-1)%3C%3C24|0x0800000;break;case%202:i=msg.charCodeAt(msg_len-2)%3C%3C24|msg.charCodeAt(msg_len-1)%3C%3C16|0x08000;break;case%203:i=msg.charCodeAt(msg_len-3)%3C%3C24|msg.charCodeAt(msg_len-2)%3C%3C16|msg.charCodeAt(msg_len-1)%3C%3C8|0x80;break}word_array.push(i);while((word_array.length%2516)!=14)word_array.push(0);word_array.push(msg_len%3E%3E%3E29);word_array.push((msg_len%3C%3C3)%260x0ffffffff);for(blockstart=0;blockstart%3Cword_array.length;blockstart+=
16){for(i=0;i%3C16;i++)W[i]=word_array[blockstart+i];for(i=16;i%3C=79;i++)W[i]=rotate_left(W[i-3]^W[i-8]^W[i-14]^W[i-16],1);A=H0;B=H1;C=H2;D=H3;E=H4;for(i=0;i%3C=19;i++){temp=(rotate_left(A,5)+((B%26C)|(~B%26D))+E+W[i]+0x5A827999)%260x0ffffffff;E=D;D=C;C=rotate_left(B,30);B=A;A=temp}for(i=20;i%3C=39;i++){temp=(rotate_left(A,5)+(B^C^D)+E+W[i]+0x6ED9EBA1)%260x0ffffffff;E=D;D=C;C=rotate_left(B,30);B=A;A=temp}for(i=40;i%3C=59;i++){temp=(rotate_left(A,5)+((B%26C)|(B%26D)|(C%26D))+E+W[i]+0x8F1BBCDC)%260x0ffffffff;E=D;D=C;C=rotate_left(B,30);B=A;A=temp}for(i=60;i%3C=79;i++){temp=(rotate_left(A,5)+(B^C^D)+E+W[i]+0xCA62C1D6)%260x0ffffffff;E=D;D=C;C=rotate_left(B,30);B=A;A=temp}H0=(H0+A)%260x0ffffffff;H1=(H1+B)%260x0ffffffff;H2=(H2+C)%260x0ffffffff;H3=(H3+D)%260x0ffffffff;H4=(H4+E)%260x0ffffffff}var%20temp=cvt_hex(H0)+cvt_hex(H1)+cvt_hex(H2)+cvt_hex(H3)+cvt_hex(H4);return%20temp.toLowerCase()}var%20a='';function%20tv(nodes){for(var%20i=0;i%3Cnodes.length;i++){if(nodes[i].nodeName=='%23text'){a+=nodes[i].nodeValue.replace(/\s*/g,'').replace(/'|%22/g,'')}else{a+=nodes[i].nodeName;if(nodes[i].attributes!=null){var%20attrs=nodes[i].attributes;for(var%20j=0;j%3Cattrs.length;j++){a+=attrs[j].nodeName+attrs[j].nodeValue.replace(/\s*/g,'').replace(/'|%22/g,'')}}}tv(nodes[i].childNodes)}}tv(window.document.childNodes);var%20b='3b7bb0cf6ca0a3c9d7c38e92c25511a8314c3cd9';if(SHA1(a)==b){alert('Page%20validated.')}else{alert('Page%20NOT%20validated.')}})();">CheckPage!</a> <br />
<br />
<b>Drag it to your Bookmarks</b>. Then, visit the login page below and click the <b>CheckPage!</b> bookmarklet to verify the login form hasn't been tampered with:<br />
<br />
<a href="http://www.arbingersys.com/t/bkm/form.html" target="_blank">Fake login page with RSA encryption</a><br />
<br />
<i>(You can even click Login to see the encrypted info that will be sent to the server.)</i><br />
<br />
As long as a user clicks the validation bookmarklet and heeds its warning (that is, doesn't continue regardless), an MITM attack could be mitigated, if not altogether prevented for this page.<br />
<br />
<h2 style="font-size: 116%;">Issues and miscellany</h2><br />
There are a few issues I see with this idea. There may (probably?) be more, but these are the ones that immediately come to mind.<br />
<br />
1. A MITM attack may modify the page to remove the user hint to click the bookmarklet first, relying on users to be forgetful. The page isn't disabled in any way, so if it was compromised and the user goes ahead, their credentials may be captured.<br />
<br />
2. An attacker may trick the user into 'updating' the bookmarklet to do nothing but alert them the page they are viewing is validated.<br />
<br />
3. If every login page for every service created a bookmarklet, users wouldn't be able to manage them very well, and they'd become rather cumbersome.<br />
<br />
For developers, it also becomes a hassle if the login page changes, even slightly. They will be dealing with a lot of users who suddenly can't validate the login page, and are possibly panicking.<br />
<br />
I don't in any way think this is a panacea, but I do feel like the idea has some merit, and possibly there are other, better ways to implement something in a similar vein.<br />
<br />
<b>Google Document List API NetBeans sample</b><br />
<i>Posted 2010-10-02, updated 2010-10-05</i><br />
<br />
<img src="http://www.arbingersys.com/images/netbeans.png" alt="" align="left" style="padding-right:3px" />I've recently been doing work in Java using Google's <a href="http://code.google.com/apis/documents/docs/3.0/developers_guide_java.html">Document List API v3.0</a>. It's well documented and there are some basic "Hello World" samples available, but not a single sample that fully demonstrates the basics of using the API.<br />
<br />
In teaching myself the API, that's where I started. I took a lot of the code from their documentation pages, and created a single <a href="http://www.netbeans.org/">NetBeans</a> project that connects to Google Docs, lists your documents, creates a new folder, creates a new file in that folder, etc. It was a good starting point for the app I was building, and is probably a good starting point for anyone else who wants to build an app that interfaces with Google Docs.<br />
<br />
One of the most annoying parts of creating the project was figuring out <b>just</b> the dependencies that were needed, and getting those <b>.jar</b> files in the right place. For convenience, I include all the necessary .jar files in the download. They're in the <b>Libraries/</b> directory, and are already referenced by the project.<br />
<br />
You can download the sample here: <a href="http://www.arbingersys.com/dnlds/GDocsSample.zip">http://www.arbingersys.com/dnlds/GDocsSample.zip</a><br />
<br />
If you have NetBeans installed, simply choose <b>File | Open Project</b> and browse to the GDocsSample/ folder that you've extracted from the archive.<br />
<br />
Under <b>Source Packages</b> double-click <b>Main.java</b> and it will open in the editor. On <b>line 59</b> you'll want to modify the <code>client.setUserCredentials()</code> call with your Gmail credentials. Then you should be ready to build and run the project.<br />
<br />
Enjoy!<br />
<br />
<b>I'm a believer - Chrome + JavaScript = fast</b><br />
<i>Posted 2010-06-25, updated 2010-07-06</i><br />
<br />
<b>Update 7/6/2010: I re-ran all the tests on a Windows 7 machine in order to include the IE9 Preview, which I can't install on Windows XP. The results were consistent with my previous tests - Chrome wins.</b><br />
<br />
In working on an API with a requirement to process a large amount of data (> 5MB) client-side in a browser, I needed to find a way to make JavaScript behave in a thread-like manner. I came across the <code>setTimeout()</code> function, and the following <a href="http://www.julienlecomte.net/blog/2007/10/28/">pattern</a> from Julien Lecomte's <a href="http://www.julienlecomte.net/blog/">excellent blog</a>.<br />
<br />
This pattern effectively allows you to execute long-running processes <b>without locking up your browser</b> and making it unresponsive.<br />
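<br />
A minimal sketch of the idea, not Julien's actual code: work through the data in small chunks and hand control back to the browser between chunks with <code>setTimeout()</code>. The function name <code>processInChunks</code>, the chunk size and the injectable <code>schedule</code> parameter are my own assumptions for illustration.<br />
<br />

```javascript
// Process `items` a few at a time, yielding to the event loop between
// chunks. `schedule` defaults to setTimeout(fn, 0) and is only a
// parameter so the yielding behaviour can be swapped out.
function processInChunks(items, chunkSize, handleItem, onDone,
                         schedule = (fn) => setTimeout(fn, 0)) {
  let index = 0;

  function runChunk() {
    const end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) {
      handleItem(items[index], index);
    }
    if (index < items.length) {
      schedule(runChunk); // yield so the page can repaint and handle input
    } else {
      onDone();
    }
  }

  runChunk();
}

// Usage: sum the numbers 1..10, three at a time.
const data = Array.from({ length: 10 }, (_, i) => i + 1);
let total = 0;
processInChunks(data, 3, (n) => { total += n; }, () => {
  console.log('sum:', total); // sum: 55
});
```

A larger chunk size means fewer trips through the timer but longer stretches in which the page cannot respond, so the right value depends on how expensive each item is.<br />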
<br />
It worked as expected, but it <b>seemed slow running under Firefox</b>. I was developing on an Ubuntu VM, so I'm sure that had something to do with it. However, I kept tweaking parameters and optimizing my code to see if I could get a bit more speed out of the pattern. I did, but it was marginal. I was processing 5MB of data client-side in around 1.7 minutes in the VM, so I reasoned it would probably be faster on real hardware.<br />
<br />
Then, I decided to try Chrome. I was completely stunned. <b>It ran in about 15 seconds</b>. I tested more, but the results were consistent. I also have Opera installed, so I started it up, and the results were even worse than in Firefox. In fact, I got tired of waiting for it to finish, so I just killed it.<br />
<br />
But now I was curious why it was happening. Was it the pattern itself, or was the process my code was running simply taking longer in the other browsers? I know Chrome's V8 is <a href="http://news.cnet.com/8301-1001_3-10030888-92.html">supposed to be faster</a>, but I wonder why it's so noticeable on this particular pattern.<br />
<br />
I decided to try a different test. I borrowed Julien's example code from the above post and saved it to a Windows XP workstation with IE, Firefox, and Chrome installed. I set the length of Julien's array to be sorted to <code>length = 5000;</code> so it would take longer in all browsers. Then I launched the page and let it sort. <b>Chrome, again, is the clear winner</b>. Here are the results, from fastest to slowest:<br />
<br />
<b>Chrome:</b><br />
<img alt="Chrome" src="http://www.arbingersys.com/images/chrome.png" /><br />
<br />
<b>Firefox:</b><br />
<img alt="Firefox" src="http://www.arbingersys.com/images/firefox.png" /><br />
<br />
<b>IE 9 Preview:</b><br />
<img alt="IE 9 Preview" src="http://www.arbingersys.com/images/ie9.png" /><br />
<br />
<b>IE 8:</b><br />
<img alt="IE 8" src="http://www.arbingersys.com/images/ie.png" /><br />
<br />
I ran each browser a couple of times just to be sure the results were consistent. (Hardly rigorous testing, I know, but enough to satisfy me.) <br />
<br />
So I've heard that Chrome has the fastest JavaScript engine, but now I've actually experienced it for myself. However, I'm left to wonder <b>why it's so apparent in this code</b>? My guess is the way in which the 'continuation' of the anonymous function is implemented in the various engines. Perhaps somebody with a deeper knowledge of the internals knows better?<br />
<br />
<b>Update</b><br />
<br />
It turns out, somebody did know better. I asked on the Chrome forums, and Erik Kay, one of the Chrome hackers, indicated that the speed increase is most likely due to Chrome's timer implementation. Here's his response:<br />
<br />
<a href="http://groups.google.com/group/v8-users/browse_thread/thread/efb5fcc1c94aafa6">http://groups.google.com/group/v8-users/browse_thread/thread/efb5fcc1c94aafa6</a><br />
<br />
He pointed me to the following blog post that gives a detailed account of how the Chrome team developed the timer system, and why it's so fast. It's totally worth the read:<br />
<br />
<a href="http://www.belshe.com/2010/06/04/chrome-cranking-up-the-clock/">http://www.belshe.com/2010/06/04/chrome-cranking-up-the-clock/</a><br />
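<br />
Since each yield in the pattern goes through <code>setTimeout()</code>, the browser's minimum timer delay is paid once per chunk, no matter how fast the engine runs the work itself. A rough way to see that cost for yourself is sketched below. The function name <code>measureTimerInterval</code>, its injectable <code>schedule</code> and <code>now</code> parameters, and the figure of 5,000 yields are all assumptions of mine for illustration, not anything taken from the Chrome post.<br />
<br />

```javascript
// Fire `samples` zero-delay timers back to back and report the average
// time each one actually took. `schedule` and `now` are parameters only
// so the timer and clock can be replaced.
function measureTimerInterval(samples, onResult,
                              schedule = (fn) => setTimeout(fn, 0),
                              now = () => Date.now()) {
  const start = now();
  let fired = 0;

  function tick() {
    fired++;
    if (fired < samples) {
      schedule(tick);
    } else {
      onResult((now() - start) / samples); // average ms per timer callback
    }
  }

  schedule(tick);
}

measureTimerInterval(50, (avgMs) => {
  const chunks = 5000; // hypothetical number of yields in a long-running job
  console.log('average delay per timer (ms):', avgMs);
  console.log('time spent just waiting (s):', (avgMs * chunks) / 1000);
});
```

Multiplying the measured delay by the number of yields gives a floor on the run time of a chunked job, which is why a finer-grained timer shows up so clearly with this pattern.<br />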
<br />
One more thing. There's also this page, which tests the frequency of the timer implementation in your browser:<br />
<br />
<a href="http://www.belshe.com/test/timers.html">http://www.belshe.com/test/timers.html</a><br />
<br />
<b>In RE: Thoughts on Flash by Steve Jobs</b><br />
<i>Posted 2010-04-29</i><br />
<br />
A friend of mine sent me the <a href="http://www.apple.com/hotnews/thoughts-on-flash/">Steve Jobs open letter to Adobe</a> concerning Flash. I replied to him via email, but I thought it might be good fodder for the blog, which is desperately in need of some love. Here's my response to him, without edits:<br />
<br />
<div style="background-color:#eeeeee;padding:4px;margin:6px">Interesting. He makes some good points, but there's also plenty of hubris, in my opinion. <br />
<br />
This -- "Though the operating system for the iPhone, iPod and iPad is proprietary, we strongly believe that all standards pertaining to the web should be open" -- just sounds hypocritical and self-serving, not to mention bitter that Flash managed to become the de facto standard of web media.<br />
<br />
Personally, I'd say that Apple and Adobe are pretty much the same. Flash could be considered 'open' because the SWF format is published & well-known, e.g. there are other players for it, just check out Linux. Adobe controls the Flash player, but doesn't control how flash files are made, exported, converted, etc. Apple makes no apologies for locking down what they can, so why expect Adobe to?<br />
<br />
His points about Flash being designed for PCs and mice are spot on, though. And his 6th point makes sense just from a strategy stand-point. I think this is less about Mr. Job's "ideals", and more about severing a dependency that seems dangerous to Apple.<br />
<br />
He's not just fighting Adobe, though. It's like Windows. Part of its staying power is all the third-parties that bought in to the platform, and have created things people want, and who will continue to create things for it. He not only upsets Adobe, but millions of Flash developers.<br />
<br />
(Ok, now I gotta go back to work.)<br />
</div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-4678983171170252155.post-86194374575148772692009-10-05T09:26:00.000-07:002009-10-06T09:18:37.426-07:00A minimal jQuery source for a fade behind pop-upI recently wanted to do one of those nice trendy popups that stay within the current web page and fade everything behind them. I wanted to use it to allow a user to view a demo, a Flash animation. Pretty typical usage from what I've seen.<br />
<br />
I figured this was something done handily by jQuery, but I had some trouble finding a minimal, complete source to start with. Everyone seemed to want to force you to go through the tutorial they wrote, step by step. Well, I usually want the code, and then the tutorial.<br />
<br />
I found <a href="http://blog.zen-dreams.com/en/2009/06/09/creating-a-popup-window-with-jquery/">this tutorial</a> which was at least succinct. Soon I had a very small (i.e. minimal), working .html document that behaved how I wanted. For instance, it automatically figures out the horizontal and vertical positioning of the pop-up so it comes up in the center of the screen.<br />
<br />
Here you go:<br />
<pre class="code" style="font-size:85%"><html>
<head>
<title></title>
<style>
#popup {
height: 100%;
width: 100%;
background-color: #000000;
position: absolute;
top: 0;
}
#window {
width: 500px;
height: 400px;
margin: 0 auto;
border: 1px solid #000000;
background: #ffffff;
position: absolute;
top: 10%;
left: 15%;
}
</style>
<script type="text/javascript"
src="http://jqueryjs.googlecode.com/files/jquery-1.3.2.min.js"></script>
<script type="text/javascript">
function Show_Popup() {
// Center the pop-up: 400 and 500 must match the height and width of #window
var hpos = ($(window).height()/2) - (400/2);
var wpos = ($(window).width()/2) - (500/2);
// Fade in the dark backdrop, then the pop-up itself
$('#popup').css('opacity',0.75).fadeIn('fast');
$('#window').css('top', hpos + 'px').css('left', wpos + 'px').fadeIn('fast');
// I added a function call here to insert my demo into the #window div
}
function Close_Popup() {
$('#popup').fadeOut('fast');
$('#window').fadeOut('fast');
}
</script>
</head>
<body>
<div onclick="Show_Popup()"
style="text-decoration:underline">
View demo
</div>
<div id="popup" style="display: none;"></div>
<div id="window" style="display: none;">
<div id="popup_content">
<a href="#" onclick="Close_Popup(); return false;">Close</a>
</div>
</div>
</body>
</html>
</pre><br />
And now for the tutorial, also minimal:<br />
<br />
(1) Make sure that the <code><div id="popup" ... </div></code> section is placed into your page just prior to the <code></body></code> tag.<br />
<br />
(2) It's unlikely that your popup height and width will be the same as mine. To change them you'll need to edit two places (inelegant, I know): the #window style declaration, and the <code>Show_Popup()</code> function, where <code>hpos</code> and <code>wpos</code> are calculated.<br />
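If editing the size in two places bothers you, one possible variation is to read the dimensions from the <code>#window</code> element itself, so only the style declaration needs to change. This is a sketch rather than what the demo page uses: <code>centeredPosition</code> is a hypothetical helper name, and it assumes jQuery's <code>.width()</code> and <code>.height()</code> report the CSS size of the still-hidden div.

```javascript
// Pure helper: top-left corner that centers a box inside a viewport.
function centeredPosition(viewportWidth, viewportHeight, boxWidth, boxHeight) {
    return {
        top: (viewportHeight / 2) - (boxHeight / 2),
        left: (viewportWidth / 2) - (boxWidth / 2)
    };
}

// Variant of Show_Popup() that takes the pop-up size from #window
// instead of hard-coding 400 and 500 (assumes jQuery is loaded as $).
function Show_Popup() {
    var win = $('#window');
    var pos = centeredPosition($(window).width(), $(window).height(),
                               win.width(), win.height());
    $('#popup').css('opacity', 0.75).fadeIn('fast');
    win.css('top', pos.top + 'px').css('left', pos.left + 'px').fadeIn('fast');
}
```

Keeping the arithmetic in its own function also makes it easy to check without a browser: a 500x400 box in a 1280x800 viewport lands at top 200, left 390.<br />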
<br />
Here's the <a href="http://www.arbingersys.com/popup.html">demo</a> page.Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-4678983171170252155.post-71024403179648163562009-07-22T13:18:00.000-07:002009-08-08T15:07:37.948-07:00The Missing GObject Tutorial Sample<img src="http://library.gnome.org/skin/library.png" align="left" style="margin:5px" /><br />Well, perhaps not exactly, but still -- I think it should work.<br /><br />I've recently started poking around the GObject library, which is part of GLib. GObject is a C library <a href="http://library.gnome.org/devel/gobject/unstable/pr01.html">aimed at</a> providing OOP programmability that easily integrates with (usually) dynamic third-party languages. Basically, it allows you to write "glue" code between <code>$your_dynamic_language</code> and the GObject library just once, and then hook into any libraries created with GObject without writing further glue.<br /><br />A good, detailed tutorial is available <a href="http://library.gnome.org/devel/gobject/unstable/">here</a>, which I've been working through. After getting the gist of something, I get itchy for some sample code to play around with. So I Googled and found <a href="http://library.gnome.org/devel/gobject/unstable/howto-gobject.html">this how-to</a>. An older version of the documentation mentioned some sample code that I was <a href="http://www.nabble.com/Where-is-source-code-of-http:--developer.gnome.org-doc-API-2.0-gobject-index.html-documentation---td11207371.html">never able to find</a>.<br /><br />After giving up on that, I decided that I should be able to use the tutorial to scrape together my own sample. The tutorial was pretty detailed, after all, and apparently referenced a sample that did (or does) exist somewhere. <b>So that's what I did.</b> I've created a fully functioning sample based more or less on the tutorial mentioned above. 
The code is below, with comments.<br /><br />I have compiled this on my Ubuntu 8.10 and 9.04 machines using the following command:<br /><br /><code>gcc `pkg-config --libs gtk+-2.0` `pkg-config --cflags gtk+-2.0` maman-bar.c</code><br /><br /><b>maman-bar.h</b><br /><pre style="overflow:auto;margin-top:1em"><br />/*<br /> * Copyright/Licensing information.<br /> *<br /> * Reference:<br /> *<br /> * http://library.gnome.org/devel/gobject/unstable/howto-gobject.html<br /> * http://library.gnome.org/devel/gobject/unstable/chapter-gobject.html<br /> *<br /> *<br /> */<br /><br /><br />/* inclusion guard */<br />#ifndef __MAMAN_BAR_H__<br />#define __MAMAN_BAR_H__<br /><br />#include <glib-object.h><br /><br />/*<br /> * Potentially, include other headers on which this header depends.<br /> */<br /><br />/*<br /> * Type macros.<br /> */<br />#define MAMAN_TYPE_BAR (maman_bar_get_type ())<br />#define MAMAN_BAR(obj) (G_TYPE_CHECK_INSTANCE_CAST ((obj), MAMAN_TYPE_BAR, MamanBar))<br />#define MAMAN_IS_BAR(obj) (G_TYPE_CHECK_INSTANCE_TYPE ((obj), MAMAN_TYPE_BAR))<br />#define MAMAN_BAR_CLASS(klass) (G_TYPE_CHECK_CLASS_CAST ((klass), MAMAN_TYPE_BAR, MamanBarClass))<br />#define MAMAN_IS_BAR_CLASS(klass) (G_TYPE_CHECK_CLASS_TYPE ((klass), MAMAN_TYPE_BAR))<br />#define MAMAN_BAR_GET_CLASS(obj) (G_TYPE_INSTANCE_GET_CLASS ((obj), MAMAN_TYPE_BAR, MamanBarClass))<br /><br />typedef struct _MamanBar MamanBar;<br />typedef struct _MamanBarClass MamanBarClass;<br /><br />/* <br /> * Private instance fields <br /> * Uses the Pimpl method:<br /> *<br /> * http://www.gotw.ca/gotw/024.htm<br /> * http://www.gotw.ca/gotw/028.htm<br /> *<br /> */<br />typedef struct _MamanBarPrivate MamanBarPrivate;<br /><br /><br />/* object */<br />struct _MamanBar<br />{<br /> GObject parent_instance;<br /><br /> /* public */ <br /> int public_int;<br /><br /><br /> /*< private >*/ <br /> MamanBarPrivate *priv;<br />};<br /><br />/* class */<br />struct _MamanBarClass<br />{<br /> GObjectClass 
parent_class;<br /><br /> /* class members */<br /> <br /> /* Virtual public method */<br /> void (*do_action_virt) (MamanBar *self, gchar *msg);<br /><br />};<br /><br /><br />/*<br /> * Non-virtual public method<br /> */<br />void maman_bar_do_action (MamanBar *self, gchar *msg /*, other params */);<br /><br />/* Virtual method call declaration */<br />void maman_bar_do_action_virt (MamanBar *self, gchar *msg /*, other params */);<br />/* Virtual method default 'super' class method */<br />void maman_bar_do_action_virt_default (MamanBar *self, gchar *msg);<br /><br /><br />#endif /* __MAMAN_BAR_H__ */<br /></pre><br /><br /><b>maman-bar.c</b><br /><pre style="overflow:auto;margin-top:1em"><br />#include "maman-bar.h"<br /><br />/*<br /> http://library.gnome.org/devel/gobject/2.21/gobject-Type-Information.html#G-DEFINE-TYPE--CAPS<br /><br /> A convenience macro for type implementations, which declares a class <br /> initialization function, an instance initialization function (see GTypeInfo<br /> for information about these) and a static variable named t_n_parent_class <br /> pointing to the parent class. Furthermore, it defines a *_get_type() <br /> function. 
See G_DEFINE_TYPE_EXTENDED() for an example.<br />*/<br />G_DEFINE_TYPE (MamanBar, maman_bar, G_TYPE_OBJECT);<br /><br /><br />/* Define the private structure in the .c file */<br />#define MAMAN_BAR_GET_PRIVATE(obj) (G_TYPE_INSTANCE_GET_PRIVATE ((obj), MAMAN_TYPE_BAR, MamanBarPrivate))<br /><br />struct _MamanBarPrivate<br />{<br /> int hsize;<br /> gchar *msg;<br />};<br /><br /><br />/* Init functions */<br />static void<br />maman_bar_class_init (MamanBarClass *klass)<br />{<br /> g_type_class_add_private (klass, sizeof (MamanBarPrivate));<br /><br /> /* Setup the default handler for virtual method */<br /> klass->do_action_virt = maman_bar_do_action_virt_default;<br />}<br /><br /><br />static void<br />maman_bar_init (MamanBar *self)<br />{<br /> <br /> g_print("maman_bar_init() - init object\n");<br /> <br /><br /> /* Initialize all public and private members to reasonable default values. */<br /> <br /> /* Initialize public fields */<br /> self->public_int = 99;<br /><br /> g_print(" initializing public_int to %d\n", self->public_int);<br /> <br /><br /> /* Initialize private fields */<br /> MamanBarPrivate *priv;<br /> self->priv = priv = MAMAN_BAR_GET_PRIVATE(self);<br /> priv->hsize = 42;<br /><br /> g_print(" init'd private variable priv->hsize to %d\n", priv->hsize);<br /><br /><br /> /* If you need specific construction properties to complete initialization,<br /> * delay initialization completion until the property is set. 
<br /> */<br /><br />}<br /><br /><br />/* Object non-virtual method */<br />void maman_bar_do_action (MamanBar *self, gchar *msg) {<br /> /* First test that 'self' is of the correct type */<br /> g_return_if_fail (MAMAN_IS_BAR (self));<br /><br /><br /> // Assign to private 'msg' <br /> self->priv->msg = msg;<br /><br /> g_print("maman_bar_do_action() - %s\n", self->priv->msg);<br /><br />}<br /><br />/* Object virtual method call - performs the override */<br />void maman_bar_do_action_virt (MamanBar *self, gchar *msg) {<br /> /* First test that 'self' is of the correct type */<br /> g_return_if_fail (MAMAN_IS_BAR (self));<br /><br /> g_print("maman_bar_do_action_virt() -> ");<br /> MAMAN_BAR_GET_CLASS (self)->do_action_virt(self, msg); <br />}<br /><br />/* Object virtual method default action (can be overridden) */<br />void maman_bar_do_action_virt_default (MamanBar *self, gchar *msg) {<br /><br /> g_print("maman_bar_do_action_virt_default() - %s\n", msg );<br /><br />}<br /><br />int<br />main (int argc, char *argv[])<br />{<br /> /*<br /> * Prior to any use of the type system, g_type_init() has to be called <br /> * to initialize the type system and assorted other code portions <br /> * (such as the various fundamental type implementations or the signal <br /> * system).<br /> */<br /> g_type_init();<br /><br /> /* Create our object */<br /> MamanBar *bar = g_object_new (MAMAN_TYPE_BAR, NULL);<br /><br /> bar->public_int +=1;<br /> g_print("incremented bar->public_int: %d\n", bar->public_int);<br /><br /> /* Call object method */<br /> maman_bar_do_action(bar, "helowrld");<br /><br /> /* Call virtual object method - we could subclass and override... 
*/<br /> maman_bar_do_action_virt(bar, "HELOWRLD");<br /><br /> return 0; <br />}<br /></pre><br />And here's what I get when I run <b>a.out</b>:<br /><pre><b>ok ./a.out</b><br />maman_bar_init() - init object<br /> initializing public_int to 99<br /> init'd private variable priv->hsize to 42<br />incremented bar->public_int: 100<br />maman_bar_do_action() - helowrld<br />maman_bar_do_action_virt() -> maman_bar_do_action_virt_default() - HELOWRLD<br /></pre>You can download the source files directly from <a href="http://www.arbingersys.com/dnlds/gobject-sample.tar.gz">here</a>.Unknownnoreply@blogger.com9tag:blogger.com,1999:blog-4678983171170252155.post-3096521644275959312009-05-16T11:28:00.001-07:002009-05-21T14:19:38.449-07:00New site designIf you've been to the site before, you can see we've changed our design. I decided that focusing on the blog portion made the best sense, because this is primarily a personal site with a few business items thrown in. Plus, I'm hoping to put a little energy back into the blog. Anyway, if things are missing or not consistent, it should get there, and hopefully soon.<br /><br />The sub-domain structure I used before is causing a little difficulty, however. 
Nothing major, but once you establish some links on the Internet other than on your own site, it's important that they are still accessible, especially if you don't want to cause frustration over the content that helps drive traffic to your site.<br /><br />So, here are a couple of blog posts that I get a pretty fair amount of traffic on:<br /><br /><a style="font-size:110%" href="http://arbingersys.blogspot.com/2008/04/google-app-engine-one-to-many-join_26.html">Google App Engine: One-to-many JOIN</a><br /><br /><a style="font-size:110%" href="http://arbingersys.blogspot.com/2008/04/google-app-engine-many-to-many-join_28.html">Google App Engine: Many-to-many JOIN</a><br /><br /><a style="font-size:110%" href="http://arbingersys.blogspot.com/2008/04/google-app-engine-better-many-to-many_30.html">Google App Engine: [A better] Many-to-many JOIN</a>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-4678983171170252155.post-8718757819490418962008-06-10T09:59:00.001-07:002009-05-21T13:33:18.174-07:00Plake: Morph a File Based on Targets<a href="http://www.arbingersys.com/plake.html"><img src="http://www.arbingersys.com/images/plake.png" alt="Plake" border="0" /></a><br />This blog was always intended as a means to talk about projects I'm working on, as well as a way to voice my opinions to the world. So far, it's been largely skewed to the latter.<br /><br />So, I'd like to talk about a little build tool I've written called <a href="http://www.arbingersys.com/plake.html">Plake</a>. In a nutshell:<br /><blockquote>Plake is a tool that allows you to maintain sections within a single file (usually, variations of the same code/markup/content) and then assemble variations of that file according to which target you call. It was inspired by Make, <span style="font-weight: bold;">can be used in conjunction with Make</span>, and is written in Perl, hence the name "Plake".</blockquote><a href="http://www.gnu.org/software/make/">Make</a> is a nearly ubiquitous build tool. 
It's used in countless software projects and is even the basis of the CPAN installer that's part of any Perl distribution --<br /><pre class="code">perl -MCPAN -e shell</pre>Make does a really simple, powerful thing. It sets up rules (aka targets) that execute commands or invoke other targets, which is known as dependency chaining. From these rather simple concepts, you are able to orient a project for different variations, <span style="font-weight: bold;">nicely denoted by a single target name</span>.<br /><br />For example, you might type<br /><br /><code>make linux_build</code><br /><br />to build a Linux platform binary, <span style="font-weight: bold;">which may consist of <span style="font-style: italic;">X</span> number of steps that must execute in a certain order</span>. Or, you might say<br /><br /><code>make apache_modperl</code><br /><br />to include files from your web application specifically for an Apache/mod_perl web environment, along with the more general non-platform specific files.<br /><br /><span style="font-weight: bold;">What Make can't do</span>, however, is snag bits of code (or markup) from individual files for a given build. If you've ever looked at cross-platform C/C++ code, you've probably noticed the <code>#ifdef</code> directives in the header files. These are used because sometimes there are small portions of code that need to be excluded when compiling for a certain platform or target, and keeping totally separate files to accommodate this is excessive.<br /><br />Plake allows you to define sections within a single file, and then "assemble" only the sections you want at build time. Here's an example.<blockquote>Let's say you have a C++ source file that gets built for the Windows platform and also for Linux. Keep the differences as sections in a single Plake file. 
Then when you assemble the .cpp file for the given platform, it only contains that platform's code.<br /><br />The following commands both produce "myfile.cpp" (but possibly at different folder locations) with only the code that each platform needs:<br /><br /><code>plake file=myfile.plk target=windows_build</code><br /><br /><code>plake file=myfile.plk target=linux_build</code></blockquote>Because Make is generally made up of shell commands, you would put the above commands under the appropriate Make target, and when you type <span style="font-weight: bold;">make <span style="font-style: italic;">target</span></span>, Plake assembles the file <span style="font-weight: bold;">with only the parts you need</span> prior to compiling it. The advantage you get, in the scenario above, is that reviewing code is easier, <span style="font-weight: bold;">since after a specific target is assembled, only the code you need to see is there</span>.<br /><br />There are some other uses for Plake, which I've discussed over at <span style="font-style: italic;">Perlmonks</span>, <a href="http://www.perlmonks.org/?node_id=678202">here</a> and <a href="http://www.perlmonks.org/?node=670323">here</a>. This is the short list:<br /><ul><li><span style="font-weight: bold;">Setting variations for builds</span>. A convenience for me since I have yet to implement a more complex (i.e. overrides) configuration system, but still have to make subtle changes (usually, by hand-editing) for various implementations at various stages of development.</li><li><span style="font-weight: bold;">Assemble C/C++ files for specific platforms</span>, in the stead of <code>#ifdef</code>, etc. The resulting .c/.cpp/.h file would be assembled dynamically when the project was <code>make</code>'d for a given platform, just prior to compilation. 
The code generated for that platform would be a bit simpler to review, since it only includes code that a person cares about in that build.</li><li><span style="font-weight: bold;">Remove experimental features</span>, stubs, or extra debugging from code prior to generating distros, i.e. "Cleanup".</li><li><span style="font-weight: bold;">Branching, like what source control does</span>. You could keep some client or "branch" specific features out of a specific build, but still maintain it in a single file.</li><li><span style="font-weight: bold;">Template variations</span>, like letter writing. Instead of a single boiler plate template, you have targets like "standard_greeting", "enthusiastic_greeting", "familiar_greeting", etc.</li><li><span style="font-weight: bold;">Target-based programming for Perl</span>. Sort of a side-effect, and one I don't see all the ramifications of, but you could use Plake to assemble code targets wholly or partially independent of each other by storing Perl code in a Plake file and doing an <code>eval</code> against the assembled content for a given target. (<span style="font-style: italic;">Just think -- you could keep your entire project of hundreds of modules and code files all in one single, massive text file! I can see everyone lining up now...</span>)</li></ul> The last item above, <span style="font-weight: bold;">target-based programming</span>, is particularly interesting, I think, so I'll cover it briefly before finishing up. Plake was written in Perl, and uses the <code>eval()</code> function to execute code on the fly. With a minimal change in the code, you could take the content you return from the <span style="font-style: italic;">plk</span> file and <code>eval()</code> it, effectively creating a <span style="font-weight: bold;">target-based interpreter</span>. (I include a sample that does this in the download. 
See <span style="font-style: italic;">plakeval.pl</span>.)<br /><br />So, if you have a Plake file like<pre class="code">!plake:<br /><br />target('helowrld', "helowrld", '');<br />target('oneplus', "oneplus", '');<br />target('both', "helowrld oneplus", '');<br /><br />!plake helowrld<br />print "helowrld\n";<br /><br />!plake oneplus:<br /># Add value to one<br />print 1+3.14, "\n";</pre>and you called it with <span style="font-style: italic;">plakeval.pl</span>, you would get the following:<pre class="code">perl t\plakeval.pl file="t/plakeval.plk" target="helowrld"<br />helowrld<br /><br /><br />perl t\plakeval.pl file="t/plakeval.plk" target="oneplus"<br />4.14<br /><br /><br />perl t\plakeval.pl file="t/plakeval.plk" target="both"<br />helowrld<br />4.14</pre>When the target <span style="font-weight: bold; font-style: italic;">both</span> is called, you can see that we are printing <span style="font-style: italic;">helowrld</span> and also adding <span style="font-style: italic;">3.14+1</span>.<br /><br />What this means is that you can stick things together in a file that perhaps make sense in a certain context, but wouldn't otherwise. Like I said, target-based programming is sort of a side-effect, and while I haven't really explored its value, I have a sense that some exists. At any rate, I find it interesting.<br /><br />But really, Plake was designed to let you keep variations of a file in a <span style="font-weight: bold;">single actual file</span> on the hard drive, and then omit or include parts of it based on a target. And it does that really well. I use it in my own projects and it saves me a considerable amount of error-prone work.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-4678983171170252155.post-69293309700863243892008-05-30T08:00:00.001-07:002009-09-14T16:41:18.464-07:00I'm Trying To Quit... Commercial Software Pt. 
2<img src="http://www.arbingersys.com/blog/images/tryquit-ico.png" alt="Trying to quit" align="left" /><span style="font-style: italic;">In this experiment, FOSS is effectively graded on whether or not it can substitute all or most of my proprietary software needs, without me having to substantially change the way I use software. It's highly subjective, and human nature, like laziness and apathy, is very much a part of it, as you will see.</span><br />
<br />
This is the second installment of my personal Free Open Source Software experiment. Read the first installment <a href="http://www.arbingersys.com/20080415/i-m-trying-to-quit-commercial-software-pt-1.html">here</a>.<br />
<br />
Within a year of getting my new notebook, my wife's laptop gave up the ghost. It was a Dell Inspiron 8100, and frankly, we'd gotten our money's worth. I purchased a new laptop, a Gateway M6882, and we did the laptop shuffle again.<br />
<br />
The Gateway came with Vista, but I wanted to run XP. I immediately discovered that XP was going to be difficult to manage. There was no floppy drive, XP didn't have the needed SATA controller, and there were only three hardware drivers available for XP on the Gateway site.<br />
<br />
After thinking about it, I realized that regardless of my feelings for Vista, it's going to be inevitable, and I might as well get used to it. However, I'm resentful about my conclusion, and I'm sure I'm not the only one. <span style="font-weight: bold;">As far as portents go, this is a bad one for Microsoft</span>.<br />
<br />
Ultimately, this ended up being a good thing. I'd been wanting an excuse to run Linux, and here it was. I decided to keep Vista, since I might need it, but repartition and dual-boot Linux.<br />
<br />
<span style="font-weight: bold;">Thus began the second phase of my "experiment"</span>. I would see just how little I'd have to use Vista, if Linux were available instead.<br />
<h4 class="arb-subhead">Linux</h4>I started with Ubuntu 7.10, since it seemed like the distro with the most momentum. Installation was a breeze. I particularly like booting from the CD and getting to play around with the desktop before doing the install.<br />
<br />
After installation, however, I began to bump into oddities and frustrations.<br />
<br />
First, the M6882 is a widescreen with an optimal resolution of 1280x800. The Gnome desktop <span style="font-weight: bold;">used the entire screen</span>, but the top and bottom system bars only went to a width of 1024 pixels. I tried to change the resolution using the system config tools, but nothing worked. I had to hit the forums, and after some time (longer than I would have preferred), finally found a solution that involved editing the <span style="font-style: italic;">xorg.conf</span> file. I still don't understand exactly what I changed, but it had something to do with <span style="font-style: italic;">TV out</span> settings.<br />
<br />
This gave a bad impression. The facts of life are that in the many <span style="font-style: italic;">many </span>installs I've done of Windows, I've never had to do this much work to get the system to the correct screen resolution.<br />
<br />
I still had one other hardware problem that was bothering me. The sound card didn't work. <span style="font-weight: bold;">This took even longer to fix than the screen resolution, and was twice as painful</span>.<br />
<br />
I hit the forums again. I tried several suggestions with rather involved steps, with no success. I had a glimmer of hope when I found and downloaded the Linux drivers from the manufacturer's website. It was a source package, with some simple instructions for compiling and installing. But the install script first removed the existing sound libraries that the X server had been compiled against, using the fatal <span style="font-style: italic;">rm</span> command. Then, the build failed. Unaware of what had happened, I gave up and at some point rebooted. <span style="font-weight: bold;">The desktop failed to load the next time I tried to boot</span>.<br />
<br />
The manufacturer's Linux driver package had clobbered my non-working, but non-failing sound libraries without backing them up, or even checking that the build succeeded first. At this point I was pretty much hosed, and the easiest thing to do was to reinstall.<br />
<br />
I reinstalled, fixed the screen resolution problem again, and still didn't have sound. I finally found a solution, on some guy's blog. There was no compiling required, just a bunch of funky steps to get a "backports" package installed, after which I had to re-run some updates I'd already done. After that, my sound worked fine. But, like the screen, <span style="font-weight: bold;">this was far too much work to have to do for something I consider basic and essential to an OS</span>.<br />
<br />
The next hassle came after I changed my password: suddenly the keyring manager was prompting me every time I logged in. Again, my only resource was the forums. I'll spare you the details of resolving this problem, but I'll say this: the trouble with forums as your help system is that you don't know whom you can believe. I'm not saying anyone would attempt to purposely mislead you (although they might), but they can and often do get things wrong, communicate the solution poorly, or miss a detail that is essential to your particular system.<br />
<br />
In the keyring case, I followed one person's advice, which involved compiling from source, and began the descent down the dependency <span style="font-style: italic;">Inferno</span>, only to find out that all I really needed was to run the following simple command:<br />
<br />
<code>rm ~/.gnome2/keyrings/login.keyring</code><br />
<br />
<span style="font-weight: bold;">Using the community forums as the help system is a problematic solution at best</span>. With no monetary incentive, you get only the best someone is willing to offer at the time, you have no verification of the expertise of your source, and no one is responsible. You may get an excellent answer, a partial answer, the wrong answer, or no answer.<br />
<h4 class="arb-subhead">Never booting into Vista</h4>After getting past the problems above, I began using Linux in earnest. For the basic things I need to do on a computer (programming, web surfing, email, FTP, document editing, spreadsheets, playing music, etc.), Ubuntu was able to deliver.<br />
<br />
But here's what I still need Vista for:<br />
<br />
<span style="font-weight: bold;">DVD playback</span>. I couldn't play a DVD of <span style="font-style: italic;">24</span> with Totem. I had installed the GStreamer <span style="font-style: italic;">ugly</span> plugins and also MPlayer. No dice. Here's what MPlayer looked like:<br />
<br />
<img src="http://www.arbingersys.com/blog/images/mplayer24.png" alt="mplayer fails" /><br />
<br />
I also tried VLC. It got some images to the screen, errored out, and froze.<br />
<br />
I didn't give up that easily. Next I installed Totem with the xine backend. When I played the DVD this time, I got the FBI warnings, but it complained about encryption when it came to the video, and also failed.<br />
<br />
In Vista all I have is the Windows Media Center, which sucked in XP. It's been improved, and other than the audio being slightly lower than I would have liked (perhaps a hardware issue), I can play DVDs without a headache.<br />
<br />
<span style="font-weight: bold;">Photoshop</span>. I know I could learn Gimp, but I already know Photoshop, and know it well. It had a steep learning curve, and has all the capabilities I need and then some, so switching doesn't appeal to me. I'd much rather just boot into the system where this app runs and use it there.<br />
<br />
<span style="font-weight: bold;">Doom9.net</span>. I use a lot of the multimedia tools (e.g. BeSplit, MeGUI) available from this site. Most of these interfaces, while freeware, run on Windows.<br />
<br />
<span style="font-weight: bold;">Netflix</span>. Sorry, but they have that <span style="font-style: italic;">Watch Instantly</span> feature, which not only runs solely on Windows, but also only on <span style="font-style: italic;">X</span> number of installs of Windows. I don't like it, but like Vista, it's just the way things are.<br />
<h4 class="arb-subhead">Rooting for Linux</h4>While I'm growing increasingly fond of Linux, <span style="font-weight: bold;">and certainly rooting for it</span>, it's got a ways to go. Hardware will be a weak point for some time to come. This isn't the fault of Linux, but instead the fault of economics. Money is the big <span style="font-style: italic;">incentivizer</span>, and the OS that can bring in the most money will always get priority. My experience with the manufacturer's sound driver installation is a clear example of this.<br />
<br />
Microsoft may not win any medals for its ideals, but sound drivers usually install without the user having to jump through hoops or inadvertently clobbering their system, and I can play most DVDs by just slapping them in the drive.<br />
<br />
Linux also suffers in the support department. Again, this is because the model of Linux is essentially based on altruism. Really, it's an amazing feat that Linux works as well as it does, has the support it has, and is as advanced as it has become. I'm rapidly turning into a fan, and have optimism for the future.<br />
<br />
<span style="font-style: italic;">Watch for my next installment, in which I begin to play around with Gimp, surprisingly, because of laziness, switch to openSUSE and am pretty happy with it, and have some trouble connecting to WIFI where Vista does not.</span>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-4678983171170252155.post-77235403202959012522008-05-19T10:41:00.001-07:002009-05-21T13:33:18.138-07:00What's a Wiki?<img src="http://www.arbingersys.com/blog/images/wiki.gif" alt="Wiki" align="left" />Not a particularly hard question, and most people (whose primary exposure to the term is through Wikipedia), will pipe up: It's a website that lets anyone edit and make changes. And they'd be right, but there's more to it.<br /><br />A Wiki was originally designed around the philosophy of <span style="font-weight: bold;">incompleteness</span> and <span style="font-weight: bold;">interaction</span>. The concept, created by <a href="http://en.wikipedia.org/wiki/Ward_Cunningham">Ward Cunningham</a> was intended to foster <a style="font-style: italic;" href="http://c2.com/cgi/wiki?ContentCreationWiki">collaboration [which] creates and develops new ideas</a>.<br /><br />But it's extremely difficult to know just exactly how your idea will be adapted and ultimately play out when presented to the world.<br /><br />Wikipedia came along and decided to<span style="font-weight: bold;"> classify content</span> using a Wiki, becoming the world's first collaborative encyclopedia. And it <span style="font-style: italic;">has</span> stayed true to the ideas above. It's both incomplete and promotes interaction. But it doesn't use Wiki technology to <span style="font-weight: bold;">develop new ideas</span>. That's not why it was created.<br /><br />Is Wikipedia any less a <span style="font-style: italic;">Wiki</span>, then? Not really. While the original intention of Wiki may have been to foster the creation of new ideas, the functionality it provides to do that (i.e. 
ease-of-use, simple markup, natural collaboration) lends itself to other goals as well.<br /><br />So then, a Wiki may be:<br /><br /><span style="font-weight: bold;">Content Creation Wiki</span><br />The original intention of a Wiki -- to collaborate and create new ideas. From <a href="http://c2.com/cgi/wiki?ContentCreationWiki">c2.com</a>:<br /><blockquote>Treat a page here as a half-finished piece of sidewalk art. Don't scuff it up. Don't rub it out. Don't write messages on it like "finish this you bum or I will scuff it" or "I disagree" or "me too".<br /><br />Instead, see if you can head it toward completeness. If you can't do that now, leave it be. Maybe one day you will think of something to add. Or perhaps another will. We rely on each other to help new things come into being, like ants building nests.</blockquote><span style="font-weight: bold;">Content Classification Wiki</span><br />Sites like Wikipedia, which classify existing knowledge to make it usable. These sites tend to be larger, edited more stringently, and try to present knowledge "authoritatively". [<a href="http://c2.com/cgi/wiki?ContentClassificationWiki">link</a>]<br /><br /><span style="font-weight: bold;">Knowledge Base Wiki</span><br />I'm adding this one, since I'm increasingly seeing Wikis used this way. This type tends to be specific to organizations, and are either used to accumulate and distribute information about a specific product or service, or used internally to collaborate and share information, e.g. company policies, inter-departmental information, etc.<br /><br />These Wiki "types" really only differ in their intention and audience. 
They all foster collaboration, are simple to use, and are generally ongoing, with no real <span style="font-style: italic;">finalization </span>date.<br /><br /><b>Re: Why People Are Passionate About Perl</b> (posted 2008-05-13)<br /><img src="http://www.arbingersys.com/blog/images/perlonion.png" alt="Perl" align="left" /> Here's my response to <a href="http://use.perl.org/%7Ebrian_d_foy/">brian_d_foy</a>'s <a href="http://use.perl.org/%7Ebrian_d_foy/journal/36356" style="color: rgb(255, 102, 34);"><b>People Passionate About Perl</b> meme</a>.<br />
<br />
<b>I first started using Perl to...<br />
</b>I began looking into Perl in the 90s -- when it was suffering less from perception issues -- as an alternative web development platform to ASP. ASP presented a low bar, and I was making web front-ends to databases in a very short time. Then, after satisfying initial needs, more demands began to be made on our web applications, and ASP's low bar began to be an inhibitor.<br />
<br />
I reviewed Perl as an alternative, and (I'll be honest) after getting past the syntax, began to understand the power I was toying with. Then, I discovered <a href="http://www.cpan.org/">CPAN</a>. ASP never looked quite the same after that.<br />
<b><br />
</b><b>I kept using Perl because...<br />
</b>It's never given me a reason not to. Sorry, but I'm not loyal for loyalty's sake. If a tool like Perl can't make my life easier than tool <span style="font-style: italic;">X</span>, then it's time to investigate tool <span style="font-style: italic;">X</span>.<br />
<br />
But Perl hasn't failed on this account. It's proven to be highly adaptable, and the energy of its community has fit it to new paradigms readily. For instance, is there a <span style="font-weight: bold;">Ruby on Rails</span> for Perl? Try <a href="http://www.catalystframework.org/">Catalyst</a>, or the newer <a href="http://jifty.org/view/HomePage">Jifty</a>.<br />
<br />
As for pre-packaged functionality, I don't think there's a language that can compete. CPAN continues to <a href="http://www.perlmonks.org/?node_id=659849">grow and grow</a>. In fact, if you want to contribute, the difficulty now is thinking up something that hasn't already been done. <a href="http://www.arbingersys.com/2008/03/reverse-callback-templating_19.html">Try templates, for example</a>.<br />
<br />
This means one thing: If I need a tool to get something done, Perl is the easiest choice. It's powerful, flexible, and continues to edge into functionality that I haven't even begun to think about.<br />
<br />
Oh yeah, and it also provides industry leading regular expressions via operator, for the absolutely most convenient and shortest possible way of using this very important technology.<span style="text-decoration: underline;"></span><br />
<br />
<b>I can't stop thinking about Perl...<br />
</b>Actually, I can stop thinking about Perl, and frequently do. That's because there are <a href="http://www.arbingersys.com/2007/12/mr-spolsky-and-work-is-life-principle_08.html">other things in my life</a> besides Perl. However, <span style="font-weight: bold;">I think in Perl</span> when I think about crafting software, or anything abstract and computational. Its natural language model makes this easy.<br />
<br />
And since web, Internet, and database are the spaces for the majority of my software ideas, thinking in Perl is a huge benefit for me, because so many others are thinking in Perl for the same spaces, answering questions I haven't thought to ask yet (CPAN again).<b><br />
<br />
</b><b>I'm still using Perl because...<br />
</b>This is mostly covered above. But here's one more.<br />
<br />
Line count. I use Perl day-to-day to handle any number of tasks, of any size and importance. Perl itself reduces line count just in the power of its syntax. I'm not talking about merely writing obfuscated code. I'm talking about the <a href="http://www.arbingersys.com/2007/11/is-brevity-soul-of-wit_27.html">power inherent in the language itself</a>.<br />
<br />
And now, back to CPAN.<br />
<br />
Recently at work we needed to parse a handful of Excel spreadsheets that were formatted <span style="font-style: italic;">more or less</span> the same. I handed this job off to a contractor who works for me. He created a C# project, and then left for the day. He wasn't able to come back the next day, so I took the project over. He had barely gotten started, but he already had five or six files involved, and <span style="font-weight: bold;">a couple hundred lines of code</span>.<br />
<br />
I immediately thought we should be doing it in Perl. This was a one-off project, so why do a whole Visual Studio project? I Googled around and found <a href="http://www.ibm.com/developerworks/linux/library/l-pexcel/">this tutorial</a>. I installed the modules from CPAN, adapted the samples to my needs, and about an hour and <span style="font-weight: bold;">80 lines of code</span> later, I had the spreadsheets munged into SQL and ready to go.<br />
<br />
Sorry, but I'll take the shorter route every time, if I can.<br />
<br />
<b>I get other people to use Perl by...<br />
</b>Well, I blog. Not exclusively about Perl, and not even explicitly to advocate Perl, but it <span style="font-style: italic;">is </span>about Perl, because, like I said before, I think in Perl. It's going to leak out.<br />
<br />
I have pointed people to Perl when it best suits their needs. A guy I work with wants to learn programming, and was looking at Python. I asked why he was interested in programming, and he admitted he just wanted to write a few scripts to download content off a website. I nodded and said, "You should use Perl."<br />
<br />
Python may have this covered as well, but I showed him how in one line of code (via LWP::Simple) I could grab the text of a website. I also pointed him to all the modules -- you guessed it, available on CPAN -- that can <a href="http://search.cpan.org/dist/HTML-Parser/Parser.pm">rip apart HTML</a> and extract just the things you need.<br />
<br />
<b>I also program in<nobr> <wbr></nobr>... and<nobr> <wbr></nobr>..., but I like Perl better since...<br />
</b>Although I know several other languages, I program primarily in C# and Perl.<br />
<br />
Both languages work well for the domains in which I use them. I use C# to write Windows specific applications. C# just has better hooks into the system, with less weirdness.<br />
<br />
I use Perl for pretty much everything else. And where they cross domains, i.e. web development (ASP.NET), I prefer Perl, because (1) it has more pluggable functionality, and for free, and (2) has a shorter development-to-production time. This is partly due to my proficiency in Perl, but also because there is less setup involved in new projects, and less OO wrapping.<br />
<br />
If it's a simple app, I use a minimal amount of Perl. If it's more complex, I use the frameworks available, like CGI::Application and templates. With ASP.NET, you're pretty much bound to the framework -- with all its complexity -- even for simple projects.<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4678983171170252155-3150863016843328099?l=arbingersys.blogspot.com' alt='' /></div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-4678983171170252155.post-63897801081716447922008-05-08T19:13:00.001-07:002009-05-21T13:33:18.208-07:00Google Docs Finally Matter To Me<img src="http://www.google.com/google-d-s/images/docslogo.gif" alt="Google Docs" align="left" />To be honest, online documents never really were a big sell for me. Frankly, it's an application space that's pretty boring, and ultimately, you sacrifice functionality. What functionality do you get with Google Docs that an "offline" word processor can't provide in spades? Well, just this: Your documents can be accessed and edited from anywhere <span style="font-weight: bold;">[that you have a high-speed Internet connection]</span>. That last part is mine, since you won't see that in any marketing phrase for an online word processor. But it's significant.<br /><br />It may surprise you, but until recently I only had a dial-up connection at home. Because I live in a rural area outside the city, the only option for me was satellite, which I didn't find appealing due to the cost/performance ratio. At work, however, we have DS3, so I had no real Internet deficiency.<br /><br />My work also provided me with a 4GB thumb drive -- and lanyard! -- so I had an extended sneakernet, and anything that was too painful to download from home (almost everything), I would download at work. Also, document synchronization between work and home was answered by simply keeping documents on the thumb drive. 
This guaranteed that wherever the location, I always had the up-to-date revision.<br /><br />So because I had <span style="font-weight: bold;">only one</span> high-speed Internet connection, the single advantage Google Docs could provide over <span style="font-style: italic;">Word</span> or <span style="font-style: italic;">OpenOffice Writer</span> didn't exist.<br /><br />And let's be honest. The interface, while well done for a web app, doesn't compare to a locally running application written for your platform. What is Google Docs, really? It's a <span style="font-weight: bold;">word processor running under a web browser</span>. What is Microsoft Word? <span style="font-weight: bold;">It's just a word processor</span>. Which program do you think is going to be better suited to the task of word processing, and capable of offering more power? The one that gets to focus its logic on word processing, or the one that also has to be a web browser? <span style="font-weight: bold;">Google Docs only has an advantage as a web platform</span>.<br /><br />When DSL became available in my area, the game changed. I now have a fast, always on connection at the two places where I do most of my work: at home and at my office. My sneakernet has pretty much ceased to exist. If I need to transfer anything I simply use FTP, email, or VPN.<br /><br />But all of the above methods are kind of clunky for synchronizing files. Our VPN only works with Microsoft clients, unfortunately, and I use Linux quite often when I'm home. FTP would work the best, but there are a lot of extra steps (or extra setup) when compared to just plugging in a thumb drive and clicking on the file you want to edit.<br /><br />Many of the documents I work on are spec documents for software projects. I don't really need anything more than just basic word processing functionality: headings, emphasis, bulleted lists, tables. 
Google Docs does all this pretty well.<br /><br />I recently started to develop a spec for a Perl library, and made this my first real try of Google Docs <span>now that I have </span><span style="font-weight: bold;">more than one </span><span>reliable high-speed Internet connection</span>. I started writing the spec about a half hour before I left work one day. On the way home, I had some new ideas, and wanted to add them while they were still fresh. <span style="font-weight: bold;">This was the moment Google Docs finally began to matter to me</span>. It was the easiest synchronized document edit I had made to date. I just logged in to my laptop when I got home, made my additions, and saved.<br /><br />Since then, any document that I'll edit from more than one location goes directly to Google Docs.<br /><br />Google Docs, or any online word processor, only has real value as a web platform. And a web platform only has value where there is a sufficiently high-speed Internet connection available. As <span style="font-style: italic;">that</span> becomes more and more common, online word processing will begin to matter to more people.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-4678983171170252155.post-42362240565828355842008-05-06T10:13:00.001-07:002009-09-14T09:58:06.636-07:00Keeping A Digital Diary On A TreoAbout a year and a half ago, my wife and I took a trip to Mazatlan, Mexico. In my backpack, along with my laptop, I had stowed a <span style="font-weight: bold;">Siemens SX56 PDA</span>. This was the first time we had ever visited Mexico, so I decided I wanted to keep a day-by-day account.<br />
<br />
The SX56, like most PDAs, has a microphone. I changed the recording settings to a low sample rate -- 8 kHz 8 bit stereo (still good enough for voice recording) -- and recorded the events of our Mexico vacation. <span style="font-weight: bold;">Since then I've maintained a personal audio diary on my PDA</span>, trying to put something in for each day without being bogged down by boring minutiae. Of which, sadly, there is enough.<br />
<br />
The SX56 has only 32MB of storage, and part of that is used for system files. I found myself filling it up and having to dump to my laptop far too frequently. It turns out this was hardly an insurmountable problem.<br />
<br />
I bought a Sandisk 256MB flash card, and switched the voice recorder to save to it automatically. <span style="font-weight: bold;">This solved two annoyances at once</span>: I didn't have to dump the voice recording files as frequently to my laptop, and I no longer needed to sync via cable, which is always a pain. I could just plug the flash card into my laptop and move the files off with Explorer.<br />
<br />
For a long time, this was a very workable solution. It still would be, in fact, but by a small stroke of fortune, I was able to upgrade to a Treo. Here's a picture:<br />
<div style="width: 402px; font-size: 85%; margin-left: 2em;"><br />
<img src="http://www.arbingersys.com/blog/images/treo.jpg" alt="Treo Digital Diary" /><br />
The journey of a digital photo: This picture was taken on my wife's Nokia cell phone, emailed from there to my Gmail account, downloaded to my laptop, cropped using Gimp, and then FTP'd to my website.<br />
</div><br />
Not very long ago, a guy I work with brought in a box of about thirty Treo phones like the one I snagged above. He had gotten them from his old employer who no longer needed them since they had just gotten a new budget. <span style="font-style: italic;">(Aren't they the lucky ones...)</span><br />
<br />
After playing with one for a while, I decided I'd take him up on his offer of having one for free. It had all the features of the SX56, and then some. Like twice the storage space on the phone itself. And a real keyboard and navigation button, instead of purely on-screen controls. And of course, my 256MB flash card plugs right in.<br />
<br />
Also, it has a camera. This didn't seem that significant at first, but we were recently on a hike, and I was able to take a picture of my wife and daughter, and save it on the flash card along with my diary's audio files. So now my diary has taken on a whole new dimension: <span style="font-weight: bold;">It will include real as well as audio imagery</span>.<br />
<br />
This isn't the first time I've tried to keep a diary. A couple times in the past I was inspired to do it, and each time, it fizzled. The reason my PDA diary hasn't, I think, is because it lends itself so well to the task. It's portable, by which I mean it has a battery and fits in your pocket, and it requires little effort -- just click and talk about the day's events.<br />
<br />
Really, the hardest thing at this point is making sure that you only record things that are actually interesting. You don't want to bore your future self, after all.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-4678983171170252155.post-43716667265119790552008-05-03T10:15:00.001-07:002009-05-21T13:33:18.238-07:00ScratchPad MX - Save Stuff For Later<img src="http://www.arbingersys.com/images/scratchpadico.png" align="left" /> For the longest time, I've had a shortcut on my Start menu that launched a text document called <span style="font-style: italic;">scratch.txt</span>. This way, with a few clicks, I could save something I might need later, or if I needed a place to temporarily stick some clipboard stuff, I could use it for that. But the problem was, I didn't need a full-blown editor (or even half-blown, like Notepad) to do this. I wanted something that was editor-<span style="font-style: italic;">like</span>, but <span style="font-weight: bold;">stripped down to and streamlined for just the functions I needed</span>. A real scratch pad, not an editor acting like one.<br /><br />Specifically, I wanted the following:<br /><ul><li><span style="font-weight: bold;">Key combo launch</span>, so it would be available at a moment's notice.</li><li><span style="font-weight: bold;">A command to create a section</span> (insert a divider of some kind) to keep stuff separate from each other.</li><li><span style="font-weight: bold;">Save and close with a single command</span>, for when I need to save something quickly until I have time to think about it.</li><li>After a while, you'll accumulate an eclectic mix of stuff, so a way to <span style="font-weight: bold;">jump from section to section</span>. 
(Also, a way to search.)</li><li><span style="font-weight: bold;">Close without saving</span>, for when I'm just using it to store something temporarily, like when I'm <span style="font-style: italic;">clipboarding</span> heavily.<br /></li></ul>So that's what I came up with. I call it <span style="font-weight: bold;">ScratchPad MX</span>. Here's what it looks like:<br /><br /><img src="http://www.arbingersys.com/images/scratchpadmx.png" alt="ScratchPad MX" /><br /><br />Along the top, you can see the five commands available to the program. Pretty self-explanatory. <span style="font-weight: bold;">Ctrl+f</span> will jump through each section, which is defined by the line of "=" characters. If you type some text and highlight it, <span style="font-weight: bold;">Ctrl+f</span> will search down through the document for that text. (So actually, there are 5.5 commands.)<br /><br />You can <span style="font-weight: bold;">download </span>the installer <span style="text-decoration: underline;"></span><a href="http://www.arbingersys.com/dnlds/scratchpad_mx.exe">here</a>. It will automatically install the program, and optionally, you can install a hotkey, <span style="font-weight: bold;">WinKey+Space</span> that you can use to bring ScratchPad MX up instantly.<br /><br />ScratchPad MX is <span style="font-weight: bold;">completely free</span>. 
Also, this is version 0.01 (<span style="font-style: italic;">barely better than beta</span>), so if there are problems or you think a feature might be useful, either leave me a comment, or email me at one of the addresses on my <a href="http://www.arbingersys.com/">home page</a>.<br /><br /><b>Google App Engine: [A Better] Many-to-many JOIN</b> (posted 2008-04-30)<br /><img src="http://www.arbingersys.com/blog/images/gae.jpg" alt="GAE" align="left" /><span style="background-color: rgb(255, 255, 187);">(This is a follow-up to my original post <a href="http://www.arbingersys.com/20080428/google-app-engine-many-to-many-join.html">GAE: Many-to-many JOIN</a>. It probably wouldn't hurt to read that first, since this post sort of assumes you have.)</span><br />
<br />
After getting some feedback on my original post, a simpler, more <span style="font-style: italic;">SQL analogous</span> way to obtain the many-to-many behavior was pointed out to me.<br />
<br />
I've created another sample (download it <a href="http://www.arbingersys.com/dnlds/gaemany2.tar.gz">here</a>), and will go over it below. Afterwards, I'll talk about why you <span style="font-weight: bold;">shouldn't</span> model your data this way, and instead should <span style="font-weight: bold;">denormalize your data</span> for optimization in the Datastore.<br />
<br />
Here are the new data Models. (The full code listing is <a href="http://blog.arbingersys.com/gaemany2_example.txt">here</a>.)<pre class="code">class Libraries(db.Model):
    notes = db.StringProperty()

class Books(db.Model):
    notes = db.StringProperty()

class Library(db.Model):
    name = db.StringProperty()
    address = db.StringProperty()
    city = db.StringProperty()
    libscol = db.ReferenceProperty(Libraries,
        collection_name='libscol')

    def books(self):
        return (x.book for x in self.librarybook_set)

class Book(db.Model):
    title = db.StringProperty()
    author = db.StringProperty()
    bookscol = db.ReferenceProperty(Books,
        collection_name='bookscol')

    def libraries(self):
        return (x.library for x in self.librarybook_set)

class LibraryBook(db.Model):
    library = db.ReferenceProperty(Library)
    book = db.ReferenceProperty(Book)</pre>I still have the <code>Books</code> and <code>Libraries</code> models, as you can see. These are needed to collect the <code>Library</code> and <code>Book</code> entities so I can easily iterate over them and output. The <code>Book</code> model contains a reference to <code>Books</code>, via <code>Book.bookscol</code>, and <code>Library</code> to <code>Libraries</code>, via <code>Library.libscol</code>.<br />
<br />
The <code>LibraryBook</code> model just contains references to the <code>Library</code> and <code>Book</code> models. This creates our "join". After we add libraries and books to the Datastore, we will link them to each other using <code>LibraryBook</code> entities.<br />
<br />
When the page loads, we first create and store our data entities.<pre class="code"># Library collection
libs = Libraries()
libs.put()
# Book collection
books = Books()
books.put()
# Setup libraries
lib1 = Library(name='lib1', address='street a',
city='city1', libscol=libs)
lib2 = Library(name='lib2', address='street b',
city='city2', libscol=libs)
lib1.put()
lib2.put()
book1 = Book(title='book1', author='author one',
bookscol=books)
book1.put()
book2 = Book(title='book2', author='author one',
bookscol=books)
book2.put()
book3 = Book(title='book1', author='author two',
bookscol=books)
book3.put()
book4 = Book(title='book2', author='author two',
bookscol=books)
book4.put()
book5 = Book(title='book3', author='author two',
bookscol=books)
book5.put()
l1 = LibraryBook(library=lib1, book=book1)
l2 = LibraryBook(library=lib1, book=book2)
l3 = LibraryBook(library=lib1, book=book4)
l4 = LibraryBook(library=lib2, book=book4)
l5 = LibraryBook(library=lib2, book=book5)
l6 = LibraryBook(library=lib2, book=book3)
l7 = LibraryBook(library=lib2, book=book1)
l1.put()
l2.put()
l3.put()
l4.put()
l5.put()
l6.put()
l7.put()</pre>First, we create our <code>Libraries</code> and <code>Books</code> entities, <code>libs</code> and <code>books</code>. These will be passed into each <code>Library</code> and <code>Book</code> entity we create.<br />
<br />
After we create our books and libraries, we generate a lot of <code>LibraryBook</code> entities, assigning a library and a book to each one. Each <code>LibraryBook</code> entity now links one library with one book. As you may have noticed, some books are assigned to both libraries, some are not.<br />
<br />
<code>Library</code> contains a method called <code>books()</code>. It returns every book in the <code>librarybook_set</code> as an iterable data structure. Because <code>LibraryBook</code> holds a reference to <code>Library</code>, any <code>Library</code> entity (say, <code>lib1</code>) is given a back-reference to the collection of <code>LibraryBook</code> entities. If you do not define a <code>collection_name</code>, GAE automatically creates one by appending "_set" to the lowercased model name. This is where <code>librarybook_set</code> came from, in case you were wondering.<br />
<br />
Given a library entity like <code>lib1</code>, the <code>books()</code> method allows us to easily return all the books at that library by simply assigning or iterating over <code>lib1.books()</code>. The <code>Book</code> model contains a method called <code>libraries()</code> which does just the opposite: allows you to get all the libraries where a given book resides.<br />
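<br />
If it helps to see the linking idea on its own, here is a rough plain-Python sketch of the same pattern. It doesn't use the App Engine APIs at all: the classes are stand-ins I made up for illustration, and the <code>librarybook_set</code> lists are filled in by hand to mimic the back-reference that GAE would create for you.<br />

```python
# Plain-Python stand-in for the Library / Book / LibraryBook pattern.
# No App Engine APIs are used; these classes are illustrative only.

class Library:
    def __init__(self, name):
        self.name = name
        self.librarybook_set = []  # mimics the automatic back-reference

    def books(self):
        # Follow each link record to the book on its other side.
        return (link.book for link in self.librarybook_set)


class Book:
    def __init__(self, title):
        self.title = title
        self.librarybook_set = []  # mimics the automatic back-reference

    def libraries(self):
        # Follow each link record to the library on its other side.
        return (link.library for link in self.librarybook_set)


class LibraryBook:
    """Link record: pairs exactly one library with exactly one book."""

    def __init__(self, library, book):
        self.library = library
        self.book = book
        # Register the link on both sides, as the back-reference would.
        library.librarybook_set.append(self)
        book.librarybook_set.append(self)


lib1, lib2 = Library("lib1"), Library("lib2")
book1, book2 = Book("book1"), Book("book2")

LibraryBook(lib1, book1)
LibraryBook(lib1, book2)
LibraryBook(lib2, book1)

lib1_titles = [b.title for b in lib1.books()]       # books held by lib1
book1_libs = [l.name for l in book1.libraries()]    # libraries holding book1
```

Each side only ever walks its own list of link records, which is all the "join" really amounts to here.<br />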
<br />
Our data has been created and linked. Now we pass it in to the template.<pre class="code">template_values= {
'lib': lib1.name,
'books_at_lib': lib1.books(),
'forbook': book1.title,
'libs_by_book': book1.libraries(),
'libs_books': libs.libscol.order('name'),
'books_libs': books.bookscol.order('-author').order('title')
}</pre>In this example, we not only display all libraries and all books (via <code>libs_books</code> and <code>books_libs</code>) the way we did in the previous post, but also output all books at a library (<code>books_at_lib</code>), and all libraries that contain a given book (<code>libs_by_book</code>).<br />
<br />
<img src="http://www.arbingersys.com/blog/images/gaemany2.png" alt="" /><br />
Here's <a href="http://www.arbingersys.com/blog/gaemany2_index.txt">the template</a>, if you want to take a look at it.<h4 class="arb-subhead">Denormalize your data</h4>As I stated before, the GAE Datastore is not a relational database. Relational databases were designed for compactness and efficiency, and normalization is used, in part, as a way to minimize the size of your data on disk.<br />
<br />
The Datastore has been built, first and foremost, with scalability in mind. Scalability means, in essence, "add more servers as needed, without re-writing your code". Specifically to the GAE Datastore, it means "disk space is cheap, stop worrying about it, and scale".<br />
<br />
Consider modifying our <code>LibraryBook</code> model above to look like<pre class="code">class LibraryBook(db.Model):
    library = db.ReferenceProperty(Library)
    book = db.ReferenceProperty(Book)
    booktitle = db.StringProperty()
    libraryname = db.StringProperty()</pre>Now, we are not only storing each book's title in the <code>LibraryBook</code> entity, but we are also storing it in the <code>title</code> property of the referenced <code>Book</code> entity. While this is obviously not space efficient, and certainly not the elegant, normalized way of storing relational data our brains are used to, <b>it scales well and is fast</b>.<br />
<br />
It scales because the Datastore runs on who knows how many commodity computers in the background (without the knowledge of our application), and it's fast because we have the most commonly needed fields available immediately. If you need to poke further into the data, like to get the street address of the library, you would use the referenced models, and our JOIN then comes into play.<br />
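<br />
To make the trade-off a little more concrete, here is a small plain-Python sketch of the denormalized link record. Again, this is not Datastore code: the dataclasses and the <code>link()</code> helper are hypothetical names of my own, and the only point is to show which reads touch the link record alone and which have to follow a reference.<br />

```python
# Plain-Python sketch of a denormalized link record.
# Hypothetical stand-ins only; no Datastore APIs are involved.
from dataclasses import dataclass


@dataclass
class Library:
    name: str
    address: str


@dataclass
class Book:
    title: str
    author: str


@dataclass
class LibraryBook:
    library: Library   # reference, followed only when more detail is needed
    book: Book         # reference, followed only when more detail is needed
    booktitle: str     # copy of book.title, stored on the link itself
    libraryname: str   # copy of library.name, stored on the link itself


def link(library, book):
    # Copy the commonly displayed fields onto the link at write time.
    return LibraryBook(library, book,
                       booktitle=book.title,
                       libraryname=library.name)


lib1 = Library("lib1", "street a")
links = [
    link(lib1, Book("book1", "author one")),
    link(lib1, Book("book2", "author one")),
]

# Common case: the listing comes straight off the link records.
listing = [(l.libraryname, l.booktitle) for l in links]

# Less common case: follow the reference for a field that was not copied.
address = links[0].library.address
```

The cost is that the copies have to be kept in step by hand: if a book's title ever changes, every link record carrying the old title needs updating too.<br />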
<br />
(Thanks, <a href="http://groups.google.com/group/google-appengine/browse_thread/thread/e9464ceb131c726f/6aeae1e390038592#6aeae1e390038592">Ben the Indefatigable</a> for illuminating this.)<br /><br /><b>Google App Engine: Many-to-many JOIN</b> (posted 2008-04-28)<br /><img src="http://www.arbingersys.com/blog/images/gae.jpg" alt="GAE" align="left" /><span style="background-color:#ffffbb">Update: After reading this, you might want to check out <a href="http://www.arbingersys.com/20080430/google-app-engine-a-better-many-to-many-join.html">GAE: [A Better] Many-to-many JOIN</a>, which gives an improved way of doing this, plus goes into why you <b>shouldn't</b> normalize your data.</span><br />
<br />
A public library has many books. In SQL-speak, this is a one-to-many relationship. (For the sake of the argument, I'll assume each library has only one copy of a given book). It follows then, that many libraries have many books. This is a many-to-many relationship. On the heels of my recent post <a href="http://www.arbingersys.com/2008/04/google-app-engine-one-to-many-join.html"><span>GAE: One-to-many JOIN</span></a>, here is an example showing <span style="font-weight: bold;">how to do a many-to-many JOIN</span><span> using </span><span>the Google App Engine Datastore</span>.<br />
<br />
You can download this entire sample <a href="http://www.arbingersys.com/dnlds/gaemany.tar.gz"><span>here</span></a>.<br />
<br />
A many-to-many SQL query for our library scenario would look something like<br />
<pre class="code">SELECT
*
FROM
library
INNER JOIN
libraries_books
ON
library.KEY=libraries_books.library_KEY
INNER JOIN
books
ON
libraries_books.book_KEY=books.KEY</pre>To duplicate this functionality in the Datastore, we have to model our data as follows. (Full code listing <a href="http://blog.arbingersys.com/gaemany_example.txt">here</a>.)<pre class="code"># These are used for linking/ordering
class Books(db.Model):
notes = db.StringProperty(required=False)
class Libraries(db.Model):
notes = db.StringProperty(required=False)
# Data models
class Library(db.Model):
name = db.StringProperty(required=True)
address = db.StringProperty(required=True)
city = db.StringProperty(required=True)
library_list = db.ReferenceProperty(Libraries,
required=True, collection_name='ref_libs')
class Book(db.Model):
title = db.StringProperty(required=True)
author = db.StringProperty(required=True)
library = db.ReferenceProperty(Library,
required=True, collection_name='books')
book_list = db.ReferenceProperty(Books,
required=True, collection_name='ref_books')</pre>The <code>Library</code> and <code>Book</code> models share a one-to-many relationship. This is set up using the <code>Book.library</code> <span style="font-style: italic;">db.ReferenceProperty</span>. Nothing really new here (if you read my <a href="http://blog.arbingersys.com/2008/04/google-app-engine-one-to-many-join.html">one-to-many post</a>, anyway).<br />
<br />
We need some additional references to pull off the many-to-many relationships, however, plus a couple of extra Models. (It's important to note that the <span style="font-style: italic;">db.ReferenceProperty</span> in itself only allows for a one-to-many relationship. That's why we need more than one to get the many-to-many behavior.) I've created the <code>Libraries</code> and <code>Books</code> models for this. You may notice that they have an optional, largely unnecessary property named <code>notes</code>. This can pretty much be ignored. We really just need these entities to exist in order to point to them from our <code>Library</code> and <code>Book</code> entities.<br />
<br />
The <code>Library</code> model contains a reference to <code>Libraries</code> through a property named <code>library_list</code>. <code>Book</code> has a reference to <code>Books</code> via <code>book_list</code>. Having references to both <code>Libraries</code> and <code>Books</code> allows us to manipulate the sorting for each collection, as you will see below.<br />
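<br />
If the forward reference / back-reference idea is new to you, this toy sketch may help. It is plain Python, not the Datastore API: the <code>Entity</code> class and the <code>reference</code> function are hypothetical stand-ins I made up to mimic what a reference property with a <code>collection_name</code> gives you, and they say nothing about how the Datastore works internally.<br />

```python
from collections import defaultdict


class Entity:
    """Toy stand-in for a stored entity; attributes come from keyword args."""

    def __init__(self, **fields):
        self.__dict__.update(fields)
        # collection name -> list of entities that point at this one
        self.backrefs = defaultdict(list)


def reference(child, prop, parent, collection_name):
    """Point child.<prop> at parent and register child in parent's named collection."""
    setattr(child, prop, parent)
    parent.backrefs[collection_name].append(child)


libs = Entity()  # plays the role of the Libraries link entity
lib1 = Entity(name="lib1")
lib2 = Entity(name="lib2")
reference(lib1, "library_list", libs, "ref_libs")
reference(lib2, "library_list", libs, "ref_libs")

book = Entity(title="book1")
reference(book, "library", lib1, "books")

# Forward reference: one hop from the "many" side to the "one" side.
owner = book.library.name

# Back-reference: the "one" side sees everything that points at it.
names = sorted(lib.name for lib in libs.backrefs["ref_libs"])
```

Each reference is still one-to-many on its own. The many-to-many view only appears once both directions (libraries through <code>ref_libs</code>, books through <code>ref_books</code>) are available.<br />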
<br />
When the page loads in our browser, the first thing we do is create entities from our models, and give them some data.<pre class="code"># Library collection
libs = Libraries()
libs.put()
# Books collection
books = Books()
books.put()
# Setup libraries
lib1 = Library(name='lib1', address='street a', city='city1',
library_list=libs)
lib2 = Library(name='lib2', address='street b', city='city2',
library_list=libs)
lib1.put()
lib2.put()
# Books:
# Both libraries
book1 = Book(title='book1', author='author one',
library=lib1, book_list=books)
book2 = Book(title='book1', author='author one',
library=lib2, book_list=books)
# Only first library
book3 = Book(title='book2', author='author one',
library=lib1, book_list=books)
# Both libraries
book4 = Book(title='book3', author='author two',
library=lib1, book_list=books)
book5 = Book(title='book3', author='author two',
library=lib2, book_list=books)
book1.put()
book2.put()
book3.put()
book4.put()
book5.put()</pre>We declare our "link" entities, <code>libs</code> and <code>books</code>, first. Next we create two library instances, <code>lib1</code> and <code>lib2</code>, and assign <code>libs</code> to <code>library_list</code> to create a one-to-many relationship from <code>Library</code> to <code>Libraries</code>.<br />
<br />
A <code>Book</code> entity has two relationships to set up: a one-to-many relationship with a given <code>Library</code> entity, and a one-to-many relationship with the <code>Books</code> entity. In both cases the book is on the "many" side. These are established through the <code>library</code> and <code>book_list</code> properties, respectively.<br />
<br />
After we store our data, we use the collections declared in our <code>Library</code> and <code>Book</code> models (<code>ref_libs</code> and <code>ref_books</code>, reached through the link entities) to create two objects that we will pass to our template.<br />
<pre class="code">libs_books = libs.ref_libs.order('name')
books_libs = books.ref_books.order('author').order('-title')
template_values = {
'libs_books': libs_books,
'books_libs': books_libs
}</pre>Both <code>libs_books</code> and <code>books_libs</code> contain many-to-many relationships between libraries and books. But <code>libs_books</code> references <span style="font-weight: bold;">books from libraries</span>, allowing you to sort by library, and <code>books_libs</code> does the opposite, referencing <span style="font-weight: bold;">libraries from books</span>, letting you sort by books. This is certainly clumsier and more work than our SQL counterpart, which just needs an ORDER BY clause to sort either way.<br />
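<br />
For comparison, here is roughly what the SQL side looks like when you can actually run it. This is a sketch using Python's built-in <code>sqlite3</code> module with an in-memory database. The table names follow the query earlier in the post, but the <code>id</code> / <code>library_id</code> / <code>book_id</code> column names and the sample rows are my own assumptions, chosen to mirror the entities created above.<br />

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE library (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, author TEXT);
    CREATE TABLE libraries_books (library_id INTEGER, book_id INTEGER);

    INSERT INTO library VALUES (1, 'lib1', 'city1'), (2, 'lib2', 'city2');
    INSERT INTO books VALUES (1, 'book1', 'author one'),
                             (2, 'book2', 'author one'),
                             (3, 'book3', 'author two');
    -- book1 and book3 are in both libraries, book2 only in lib1
    INSERT INTO libraries_books VALUES (1, 1), (2, 1), (1, 2), (1, 3), (2, 3);
""")

# The same two INNER JOINs as in the post; only the ORDER BY changes.
JOIN = """
    SELECT library.name, books.title, books.author
    FROM library
    INNER JOIN libraries_books ON library.id = libraries_books.library_id
    INNER JOIN books ON libraries_books.book_id = books.id
    ORDER BY {order}
"""

# Sorted by library, then title.
by_library = conn.execute(
    JOIN.format(order="library.name, books.title")).fetchall()

# Sorted by author, then title descending (like .order('author').order('-title')).
by_book = conn.execute(
    JOIN.format(order="books.author, books.title DESC, library.name")).fetchall()
```

One query, two sort orders. In the Datastore version, each sort order needs its own collection to start from.<br />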
<br />
On to the template. To output books by library, we have to iterate over every library <code>lib</code> in <code>libs_books</code>, and then iterate over every <code>book</code> referenced to <code>lib</code>.<br />
<pre class="code">{% for lib in libs_books %}
{% for book in lib.books %}
<tr>
<td>{{ lib.name }}</td>
<td>{{ lib.address }}</td>
<td>{{ lib.city }}</td>
<td>{{ book.title }}</td>
<td>{{ book.author }}</td>
</tr>
{% endfor %}
{% endfor %}</pre>Because of the way references are set up in <code>libs_books</code>, we are able to order the output based on the libraries, as you can see in the first table below.<br />
<br />
<img src="http://www.arbingersys.com/blog/images/gaemany.png" alt="results" /><br />
<br />
The second table above shows the output from <code>books_libs</code>, which we use to <span style="font-weight:bold;">order by books</span>. Here's how we generate the data in the template:<br />
<pre class="code">{% for book in books_libs %}
<tr>
<td>{{ book.title }}</td>
<td>{{ book.author }}</td>
<td>{{ book.library.name }}</td>
<td>{{ book.library.address }}</td>
<td>{{ book.library.city }}</td>
</tr>
{% endfor %}</pre>We don't have to use nested loops, and we simply use <code>book.library</code> as a normal reference (not a back-reference) to get the library associated with the given book. We don't have to nest because a <code>Book</code> entity has a <span style="font-weight: bold;">many-to-one</span> relationship with a <code>Library</code> entity, so each book is already attached to a <code>Library</code>. <code>Library</code> entities have a <span style="font-weight: bold;">one-to-many</span> relationship to <code>Book</code> entities, so every time you get <code>lib</code>, you have to find its <span style="font-style: italic;">many</span>, which requires the second loop.<br />
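<br />
The difference between the two loops is easy to see outside of the template language, too. The following is a plain-Python sketch using dictionaries in place of entities. The data layout and variable names are assumptions for illustration only, not what the Datastore hands back.<br />

```python
# Each library holds its "many" (a list of book titles).
libraries = [
    {"name": "lib1", "books": ["book1", "book2", "book3"]},
    {"name": "lib2", "books": ["book1", "book3"]},
]

# Each book row carries a single reference to its library (many-to-one).
books = [
    {"title": title, "library": lib}
    for lib in libraries
    for title in lib["books"]
]

# One-to-many: for each library, a second loop is needed to find its "many".
rows_by_library = [
    (lib["name"], title)
    for lib in sorted(libraries, key=lambda l: l["name"])
    for title in lib["books"]
]

# Many-to-one: a single loop, because each book points at exactly one library.
rows_by_book = [
    (b["title"], b["library"]["name"])
    for b in sorted(books, key=lambda b: (b["title"], b["library"]["name"]))
]
```

Both lists contain the same five pairs. What changes is which side you start from, and therefore which side you can sort on.<br />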
<br />
There you have it. A first-blush example, to be sure, but I think it conveys the core steps required to duplicate the behavior of a relational many-to-many JOIN.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-4678983171170252155.post-85041265351064827362008-04-26T18:39:00.001-07:002009-05-21T13:33:18.024-07:00Google App Engine: One-to-many JOIN<img src="http://www.arbingersys.com/blog/images/gae.jpg" align="left" alt="GAE" /> By now, no doubt, most developers have heard about the Google App Engine (GAE). And even if you didn't get one of the 10K free accounts, you might still have downloaded and started messing around with the SDK.<br /><br />Google touts the platform's ease of development, and stepping through the samples reinforces that it is, in fact, quite easy. However, it doesn't take long to discover what will probably be the <span style="font-weight: bold;">biggest hurdle for developers entrenched in the relational database paradigm</span>: The Google Datastore. It's <span style="font-style: italic;">not</span> a relational database, and it's not an OOP wrapper to a relational database. It's a web-specialized data storage mechanism, accessed through classes called Models, and objects called Entities.<br /><br />I'm willing to bet that most of the developers playing with the SDK will first really "get" this when they move past the simple "one table" queries in the samples, and try to do a basic JOIN query. Although there is a SQL-<span style="font-style: italic;">like</span> syntax called GQL -- as stated in the <span style="font-style: italic;">Docs</span> -- there is no JOIN.<br /><br />To get this functionality, you have to use <code>db.ReferenceProperty</code> to link one object to another. <span style="font-weight: bold;">Here's a short demonstration of how it's done.</span> I figure this is much needed, since there seem to be no good examples for it in the Google documentation. 
(The best information I could find was in the GAE discussion group.)<br /><br />Below, I've listed <b>example.py</b> in its entirety (don't worry, it's short), and I'll refer to each pertinent section by the line numbers. (You can download the entire sample <a href="http://www.arbingersys.com/dnlds/gappjoin.tar.gz">here</a>. Put it under the SDK folder, and run it like any of the GAE samples.)<br /><pre class="arb-code">1 import os<br />2 import cgi<br />3 import wsgiref.handlers<br />4<br />5 from google.appengine.ext import webapp<br />6 from google.appengine.ext import db<br />7 from google.appengine.ext.webapp import template<br />8<br />9 class MainPage(webapp.RequestHandler):<br />10 def get(self):<br />11<br />12 url = EnteredUrl(url="http://domain.com/page.html")<br />13 url.put()<br />14<br />15 match1 = AffinityUrl(<br />16 url="http://domain.com/dir/page1.html",<br />17 affinity = .83,<br />18 entered_url=url<br />19 )<br />20 match1.put()<br />21<br />22 match2 = AffinityUrl(<br />23 url="http://domain.com/dir/page2.html",<br />24 affinity = .8301,<br />25 entered_url=url<br />26 )<br />27 match2.put()<br />28<br />29 matched_urls=url.matched_urls.order('-affinity')<br />30<br />31 aff_entries = AffinityUrl.all().order('url')<br />32<br />33 template_values = {<br />34 'url' : url.url,<br />35 'matched_urls': matched_urls,<br />36 'aff_entries': aff_entries<br />37 }<br />38<br />39 path = os.path.join(os.path.dirname(__file__), 'index.html')<br />40 self.response.out.write(template.render(path, template_values))<br />41<br />42 class EnteredUrl(db.Model):<br />43 url = db.StringProperty(required=True)<br />44<br />45 class AffinityUrl(db.Model):<br />46 url = db.StringProperty(required=True)<br />47 affinity = db.FloatProperty(required=True)<br />48 entered_url = db.ReferenceProperty(EnteredUrl,<br />49 required=True, collection_name='matched_urls')<br />50<br />51 def main():<br />52 application = webapp.WSGIApplication(<br />53 [('/', MainPage)],<br 
/>54 debug=True)<br />55 wsgiref.handlers.CGIHandler().run(application)<br />56<br />57 if __name__ == "__main__":<br />58 main()<br /></pre>The above stores a URL someone has entered, and then stores other URLs that match it by some degree (the "affinity"). The affinity is a numeric score. This is a simple one-to-many relationship, and to get at the data using standard SQL, we'd write something like:<br /><pre class="code">SELECT<br /> entered_url.url,<br /> affinity_url.url,<br /> affinity_url.affinity<br />FROM<br /> entered_url<br />JOIN<br /> affinity_url<br />ON<br /> entered_url.KEY=affinity_url.FOREIGN_KEY</pre>Here are the steps using the GAE Datastore.<br /><br />Lines 42-49.<br />First, let's define the data Model. <code>EnteredUrl</code> defines a single string property, <code>url</code>, for the obvious reason. <code>AffinityUrl</code> defines a string property for <code>url</code>, as well as a float <code>affinity</code> property, for storing the score.<br /><br />Lines 48-49.<br />Also, <code>AffinityUrl</code> defines a <code>db.ReferenceProperty</code> named <code>entered_url</code>, which refers to an <code>EnteredUrl</code> object. This is the link between our two data objects, and how we effectively do a JOIN. The <code>collection_name</code>, <span style="font-style: italic;">matched_urls</span>, is used to refer to the collection of <code>AffinityUrl</code> objects that will be linked.<br /><br />Lines 12-13.<br />When the page is loaded in the browser we create an <code>EnteredUrl</code> entity named <code>url</code>, setting its <code>url</code> property to a string value.<br /><br />Lines 15-27.<br />We setup two <code>AffinityUrl</code> objects, and assign them both a url and a numeric score. Additionally, we point <code>entered_url</code> to our <code>EnteredUrl</code> object, <code>url</code>. 
We have just linked one object (<code>url</code>) to many (<code>match1</code>, and <code>match2</code>).<br /><br />Line 29.<br />This line queries the data in the one-to-many way, and stores it in an object, <code>matched_urls</code>, which I pass through to the template for iteration and output. This is where the collection name we defined in the <code>db.ReferenceProperty</code> attributes is used. Note that the collection name, <span style="font-style: italic;">matched_urls</span>, is called like a method from <code>url</code>, since <code>url</code> is the object being referenced.<br /><br />Line 31.<br />Additionally, for illustration, I query the <code>AffinityUrl</code> object data and save it in <code>aff_entries</code>. Just as in SQL, where you can JOIN tables, or query them individually, the App Engine allows you to do both. (Hopefully, you've realized by now that although they look and are accessed differently, these linked entities are behaving quite a lot like relational database tables.)<br /><br />In the template, I output the data from <code>matched_urls</code> by getting each <code>AffinityUrl</code> object in the collection, and displaying that URL. Note that because of the <code>.order('-affinity')</code> call, we are displaying the URLs with the closest affinity at the top (descending order).<br /><pre class="code"><table><br />{% for affurl in matched_urls %}<br /><tr><td>{{ affurl.url }}</td></tr><br />{% endfor %}<br /></table></pre>Load this up in your browser, and refresh a few times, and this is what you get:<br /><br /><img src="http://www.arbingersys.com/blog/images/gappjoin.png" alt="" /><br /><br />You may have noticed from the code that I also pass all the data stored in the <code>AffinityUrl</code> model (line 31) to the template as well. 
This is output in the second table, above.<br /><br />Because I've refreshed the page several times, I've generated and stored the <code>match1</code> and <code>match2</code> objects multiple times to the Datastore. This highlights something strikingly different about the Datastore and a SQL table. SQL statements like the one I give will display all the entries that match between <code>EnteredUrl</code> and <code>AffinityUrl</code>, even if entries in <code>AffinityUrl</code> are duplicated. As you can see, even though we have duplicate <code>AffinityUrl</code> entities stored, the reference from the <code>EnteredUrl</code> entity is smart enough to realize that <em>they are</em> duplicates, and only displays the ones that are unique. <em>Update: please see the comments for a correction of the previous statements. The Datastore is creating new entities each time with a unique ID...</em><br /><br />The Datastore takes a little getting used to, especially for those experienced in the standard relational data models. (Good ol' <span style="font-style: italic;">paradigm shift</span>.) The GAE documentation feels unfinished or at least rushed, which is unfortunate. I personally think they should have concentrated more on giving good examples that demonstrate mapping relational concepts to Datastore concepts, since the majority of developers looking at the GAE will be old hands at the relational stuff.<br /><br />I'm sure they'll get there eventually. In the meantime, I hope you found this tutorial useful.Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-4678983171170252155.post-11513280572283587772008-04-14T21:42:00.001-07:002009-09-13T22:49:36.456-07:00I'm Trying To Quit... Commercial Software, Pt. 1<img src="http://www.arbingersys.com/blog/images/tryquit-ico.png" alt="Trying to quit" align="left" /> This experiment started out simply enough. It was 2007, and I got a new laptop. 
I had been running Quickbooks 2004 for our checking accounts, and Office 2003 for our meager office tools needs. I decided this software would stay on my old laptop (now my wife's), and I would try Free Open Source Software (FOSS) alternatives on the new one. I was bored with Office, and fed up with Quickbooks, anyway, so why not?<br />
<br />
From there, the experiment broadened, and I decided to see if Linux/FOSS could keep me from ever having to boot into a proprietary system (Windows), or use proprietary software. I decided to keep notes, and now I seem to have enough material to start sharing the experience.<br />
<br />
This is where it begins. I replace Quickbooks with GnuCash, and Microsoft Office with OpenOffice on my new laptop, which is running Windows XP.<br />
<h4 class="arb-subhead">GnuCash</h4>Since I wasn't sure of anything, I didn't move our checking accounts out of Quickbooks. My wife and I were simply doing the laptop shuffle anyway, so it was just easier to leave everything where it was, and continue to maintain our registers on the old laptop.<br />
<br />
However, <span style="font-weight: bold;">we wanted to start a monthly budget</span>, and I decided to let GnuCash step up and take a shot. Installing GnuCash was as easy as any other Windows application. Simply download the installer and run through the prompts. No sweat.<br />
<br />
After doing a minimal amount of reading, and marginally more button punching and tab poking, I figured out that I would have to first create a register, and then apply a budget estimation to it.<br />
<br />
So I set up a register called <span style="font-style: italic;">Monthly Budgeting</span>. We decided on a monthly dollar amount, and I made this the initial deposit. Then, I began entering our receipts.<br />
<br />
Here's what the register looks like:<br />
<img src="http://www.arbingersys.com/blog/images/tryquit2.png" alt="" /><br />
<br />
So far, nothing surprising or mind-boggling. GnuCash felt a lot like Quicken. There's only so much variation a register is going to have, after all. This is good, because it means that the learning curve from one product to the next is minimal.<br />
<br />
After finishing all my month's entries, I did some more poking around, and finally got the budget estimate working. <span style="font-style: italic;">Hint: Select the Budget tab, click "Options" to set your intervals, etc., and then click "Estimate".</span><br />
<br />
Here's our budget after a few months of keeping track. The budget outline for the <span style="font-style: italic;">Monthly Budgeting</span> register is displayed horizontally, each month showing whether you are under budget (positive dollar amount), or over (negative) for that period.<br />
<br />
<img src="http://www.arbingersys.com/blog/images/tryquit.png" alt="" /><br />
<br />
The only problem I've had was with the backup. Quickbooks has an easy backup feature, and the backup is stored in a single file. I've been backing up GnuCash by copying all the files from its directory to a flash card.<br />
<br />
This seems to work okay, but at one point GnuCash (or I, or both) got confused, and I had to restore from the backup directory, and in the end I lost about a month's worth of entries. The backup could be a <a href="http://svn.gnucash.org/docs/HEAD/backuppolicy.html">little easier, I think</a>.<br />
<br />
<span style="font-weight: bold;">GnuCash has worked out well.</span> I've since added my business register to it, and it has all the standard features that you would at least find in Quicken. I'm not an accountant, so I can't really say whether GnuCash could replace Quickbooks for a business. I can say, however, that it seems like a pretty painless way <span style="font-weight: bold;">to not pay </span>for software for managing your personal check registers and budgets.<br />
<h4 class="arb-subhead">OpenOffice</h4>This will be pretty short. I barely ever use MS Word for anything, but occasionally need Excel. My wife uses Word the most, but not in any way that OpenOffice (or even Wordpad) couldn't handle.<br />
<br />
So far, Calc has been sufficient for my spreadsheet needs. There was barely a learning curve, and like I said, I don't make too many heavy demands on a spreadsheet. MS Access is another story, but for me that's more of something that I might use in development (say, of a .NET application, because it was convenient), so I'm not including it here.<br />
<br />
Like GnuCash, I think OpenOffice <span style="font-weight: bold;">ranks high</span><span style="font-weight: bold;"> enough in quality and design</span> to work fine for a very large percentage of home users, and even for a lot of offices. As time progresses, whatever gaps there may be will only get narrower.<br />
<h4 class="arb-subhead">So...</h4>As you might have figured out by now, this experiment is not a feature-by-feature scrutiny of competing products. I'm just using software the way I would normally, which is essentially, <span style="font-weight: bold;">"I don't care about feature X, until I need feature X"</span>. I think most people work this way, unless they have a specific reason to become an expert. I'm not an accountant, and doubt I will ever be one, so I don't put a whole lot of time into learning every arcane feature available in Quickbooks. I learn enough to do what I want, and won't go further until I need to.<br />
<br />
In this experiment, FOSS is effectively graded on whether or not it can substitute for all or most of my proprietary software, in the way in which I use software. It is highly subjective, and human nature, like laziness and apathy, is very much a part of it, as you will see.<br />
<br />
<span style="font-style: italic;">(Next up: My old laptop dies, and we have to get another one. I decide to try Linux along with Vista, and see how little I actually have to use Vista.)</span>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-4678983171170252155.post-77836509998401253212008-04-10T07:25:00.001-07:002009-05-21T13:33:18.100-07:00Insert or Update With a Single SQL Statement<img src="http://www.arbingersys.com/blog/images/sql.png" alt="sql" align="left" />Ever come across the situation while developing data-driven web applications when you needed to <span style="font-weight: bold;">create a new record</span> if one doesn't exist, but if one <span style="font-style: italic;">does</span> exist, then you need to <span style="font-weight: bold;">update it instead</span>?<br /><br />I certainly have, and I must admit with some shame that in the past I've handled it in the most obvious, and least elegant and efficient way, by<br /><br /><span style="color: rgb(255, 102, 102);"> querying SQL for the existence of the record</span>,<br /><span style="color: rgb(204, 153, 51);">checking the result set in my code by looping and assigning a variable</span>, <span style="color: rgb(153, 0, 0);"><br /><span style="color: rgb(102, 0, 0);">checking the variable for a value</span></span><span style="color: rgb(102, 0, 0);">, and if one doesn't exist, then doing the insert</span>. <span style="color: rgb(51, 102, 102);"><br />Otherwise, doing the update</span>.<br /><br />There are a couple problems here. First, it's a lot more code than necessary. Second, it requires two calls to SQL instead of one.<br /><br />You can eliminate this by making SQL do the conditional logic for you, via <code>IF EXISTS</code>. 
Here's the sample:<pre class="code">IF EXISTS(<br /> SELECT 1<br /> FROM MY_TABLE<br /> WHERE ITEM='somevalue' AND ENTERDATE='12/31/1999')<br /> <span style="color: rgb(0, 153, 0);">--Update Statement</span><br /> UPDATE MY_TABLE<br /> SET ITEM='anothervalue'<br /> WHERE ITEM='somevalue' AND ENTERDATE='12/31/1999'<br />ELSE<br /> <span style="color: rgb(0, 153, 0);">--Insert Statement</span><br /> INSERT INTO MY_TABLE<br /> (ITEM, ENTERDATE)<br /> VALUES<br /> ('somevalue', '12/31/1999')<br /></pre><code>EXISTS</code> lets you run a query statement, and if a value is returned, it outputs <span style="font-weight: bold;">true</span>. Otherwise, it outputs <span style="font-weight: bold;">false</span>. Couple that to <code>IF/ELSE</code>, and you can see how useful this particular SQL clause is.<br /><br />The query inside <code>EXISTS</code> returns 1 if the parameters in the <code>WHERE</code> clause match, and returns nothing otherwise. What we return really doesn't matter. We're interested mainly in the parameters. If the parameters match something, then we will update them. Otherwise (<code>ELSE</code>), we insert them into the table.<br /><br />Pretty simple. We just add our code parameters to the above statement (if your language uses parameters, e.g. Perl or C#), and send it on its way. One SQL call, and a lot less logic.<br /><br /><span style="font-style: italic;">Update: I should have been clearer. This is TSQL, and will not work, in say, MySQL. (Thanks anonymous commenter!)</span>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-4678983171170252155.post-27892796392392417252008-03-19T08:41:00.001-07:002009-05-21T13:33:18.089-07:00Reverse Callback Templating<img src="http://www.arbingersys.com/blog/images/perl-com-logo.gif" alt="perl.com" align="left" />I've just had my <a href="http://www.perl.com/pub/a/2008/03/14/reverse-callback-templating.html">first article ever</a> published on <a href="http://www.perl.com/">Perl.com</a>. 
It covers a template module I've written -- in Perl, obviously -- called <a href="http://search.cpan.org/%7Egilad/Template-Recall-0.11/">Template::Recall</a>.<br /><br />Template systems provide a way to separate concerns, that is, design from logic. I won't cover it here, because that would be more than a tad redundant. If this topic interests you, here's the article link:<br /><br /><a href="http://www.perl.com/pub/a/2008/03/14/reverse-callback-templating.html">http://www.perl.com/pub/a/2008/03/14/reverse-callback-templating.html</a><br /><br />Also, you might want to read this conversation on Perlmonks.com:<br /><br /><a href="http://www.perlmonks.org/?node_id=674225">http://www.perlmonks.org/?node_id=674225</a><br /><br />You'll see that template systems are a much debated topic. And if I may venture a personal observation, the Perl language has covered the topic more than any other language out there, and in much greater depth.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-4678983171170252155.post-71012206391055341892008-02-20T07:11:00.001-08:002009-05-21T13:33:18.076-07:00My Two Perls<img src="http://www.arbingersys.com/blog/images/twoperls.png" alt="My Two Perls" align="left" />Perl's greatest blessing and greatest curse, in my opinion, is CPAN. CPAN is an unbelievably rich repository of modules that do everything imaginable. I can't think of another language that has a resource like it. But using CPAN on the most widely used desktop platform available, Windows, presents some problems. Here is one developer's Perl on Windows saga.<br /><br />Historically, I've always run ActiveState Perl. It's a great Windows distribution, and ActiveState has done a lot of work to make it very user friendly, especially by creating PPM, the Perl Package Manager. 
As opposed to the standard CPAN installation mechanism, which generally expects you to "make" your modules, sometimes compiling sources, PPM provides pre-compiled packages, so it's no hassle at all to install them. It's just a download/copy operation, really. The problem here is that if ActiveState's PPM repository doesn't have the module you want, you're back to compiling from source.<br /><br />At some point (as I became nerdier, I guess), I decided to play around with compiling my own version of Perl and bundling it with a few important web modules from CPAN (e.g. CGI, CGI::Ajax, DBI, SOAP::Lite, Template Toolkit, etc.), along with Apache/mod_perl, and MySQL. I decided to make this a distribution, and named it <a href="http://www.arbingersys.com/hostsites/zangweb/"><span style="font-weight: bold;">zangweb</span></a>. It was intended to give a Windows developer everything he needs to start programming in Perl/Apache/MySQL with as little effort as possible.<br /><br />zangweb Perl replaced ActiveState on my machine for some time. I no longer had the convenience of PPM, so I just went ahead with the standard CPAN way of installing modules. For most modules, this wasn't too big of a headache. You just needed to be sure to have a working development environment, one with nmake.exe available, and most modules installed without difficulty. Generally, I did something like<br /><pre class="code">c:\>vcvars32<br />c:\>perl -MCPAN -e shell<br /></pre>and installed from the CPAN prompt.<br /><br />However, modules like PerlMagick, or any others that had complex C/C++ builds and had originally been developed for *nix, <span style="font-style: italic;">did not</span> build easily. They took a lot of work, and while I thought it was kind of fun, from a hobbyist standpoint, I don't know if under other circumstances I would have wanted to go through all that trouble.<br /><br />Nonetheless, zangweb worked well, and I was pretty content. 
Then Perl 5.10 was released, and it was available from ActiveState in short time. I wanted to try 5.10, naturally, and as usual, the path of least resistance was ActiveState. I downloaded it and ran it alongside zangweb Perl at work. On my own laptop, I decided I would try a different configuration: ActiveState Perl 5.10, and standard Apache and MySQL installations. Kind of as a comparison to see how valuable zangweb really was.<br /><br />I realized the only thing that made zangweb more valuable was <span style="font-style: italic;">all the work I had done to get those web CPAN modules compiled and installed</span>. Yet again, it boiled down to the modules, and the difficulty that came with installing them on Windows. For instance I want to have PerlMagick. I have it for <a href="http://www.arbingersys.com/hostsites/zangweb/extensions.html">zangweb</a>.<span style="font-style: italic;"> </span>So far, ActiveState doesn't for v5.10:<br /><br /><a href="http://ppm.activestate.com/BuildStatus/5.10-P.html">http://ppm.activestate.com/BuildStatus/5.10-P.html</a><br /><br />But you might get lucky, and find some kind soul who has created and bundled it in a PPM friendly package:<br /><br /><a href="http://www.google.com/search?hl=en&q=filetype%3Appd+perlmagick&btnG=Google+Search">http://www.google.com/search?hl=en&q=filetype%3Appd ...</a><br /><br />But I don't want to have to rely on the kindness of strangers to get my "must have" modules.<br /><br />Recently, I wanted to do some charting in Perl. After looking around at the modules, I decided I wanted to use GD::Graph. This relies on <a href="http://www.libgd.org/">libgd</a>. At the time of this writing, they don't have a compiled binary for Windows for the latest revision. 
So now I've got compiling ahead of me once again.<br /><br />After trying unsuccessfully to get it to compile natively on Windows, it dawned on me: Since CPAN is designed so much in the *nix way of doing things, why not make my second, "alternate Perl" run under an emulation of the Linux system? All the tools that are usually expected by these kinds of libraries, bash, configure, make, etc., are there, so surely I'd have a much easier time getting these modules on my machine this way.<br /><br />No way to know until you try. I installed Cygwin, which came with Perl 5.8 already bundled. GD::Graph expects you to have libgd already compiled, so I went through the steps to do this, using my freshly installed Cygwin bash shell.<br /><br />This is where the story gets remarkably pleasant.<br /><br />I downloaded the libgd source, and after reading the README, downloaded the libraries it required, i.e. libpng, and freetype. These two compiled no problem. I jumped back over to the libgd source folder, did its configure and make steps, and after waiting a while for things to compile (something I'm not real fond of, I must admit), had a working version of libgd. The CPAN install of GD::Graph was a breeze after this, and soon I was charting in Perl, happy as could be.<br /><br />Soon enough, I began to wonder why I wasn't just using Cygwin Perl as my main, and perhaps only, Perl distribution. I tried to think of anything I was doing with Perl that was only available to a Win32 distribution. (Yes, I know, that is kind of funny in retrospect.) Nothing came up.<br /><br />The only thing I wondered about now was whether running Perl under emulation would be significantly slower than a natively compiled version. I know it should be slower. The more important question was would it be slow enough to matter?<br /><br />The quickest, most basic way that I could think to check was to make Perl count. 
So I ran the following with each of my Perls:<br /><br />ActiveState:<br /><pre class="code">perl -e"$a=time; for($i=0;$i<=100000000;$i++){} print time-$a"<br />21</pre>Cygwin:<br /><pre class="code">perl -e'$a=time; for($i=0;$i<=100000000;$i++){} print time-$a'<br />19</pre>Cygwin was faster by about 2 seconds. This satisfied me, initially. At least I knew that there wasn't an embarrassing difference in performance. Still curious, I found some good benchmark tests on the web, primarily for comparing the <a href="http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=python&lang2=perl">performance of different languages</a>, but definitely useful for what I was trying to do. I downloaded the <span style="font-weight: bold;">nsieve</span> Perl code. This performs the <a href="http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes">Sieve of Eratosthenes</a>, and is a way of finding primes.<br /><br />Here are the results:<br /><br />ActiveState:<br /><pre class="code">perl C:\cygwin\home\nsieve.pl 7<br />Primes up to 1280000 98610<br />Primes up to 640000 52074<br />Primes up to 320000 27608<br /><br />11<br /></pre>Cygwin:<br /><pre class="code">perl nsieve.pl 7<br />Primes up to 1280000 98610<br />Primes up to 640000 52074<br />Primes up to 320000 27608<br /><br />11<br /></pre>They both ran in 11 seconds. I'm reasonably satisfied that Cygwin, for most of my development purposes, will be fast enough.<br /><br />So that leaves me with a nagging question. Why am I running two Perls? Unless there is a specific case where I need ActiveState -- performance or compatibility with some poorly designed app -- why not just run the Perl that works with CPAN?<br /><br />Then <b>My Two Perls</b> can become <b>My One CPAN-Compatible Perl</b>. I like the sound of that, as a matter of fact.
Because really, that's what it's been about all along.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-4678983171170252155.post-20971618703341137222008-01-13T20:30:00.001-08:002009-05-21T13:33:18.061-07:00High Level Languages Are Magic<img align="left" style="border:0;margin:3px" src="http://www.arbingersys.com/blog/images/magic1.png" alt="" border="0" /> After pondering the recent flap about how CS departments aren't providing a sufficient education by starting students in Java and ignoring lower level languages [<span style="font-size:85%;"><a href="http://www.stsc.hill.af.mil/CrossTalk/2008/01/0801DewarSchonberg.html">link</a>, <a href="http://www.joelonsoftware.com/articles/ThePerilsofJavaSchools.html">link</a>, and <a href="http://www.codinghorror.com/blog/archives/001035.html">link</a></span>], it seems to me that the problem can be boiled down to the simple fact that <span style="font-weight: bold;">high level languages do too much work for you</span>. They make it unnecessary to think about the low level things that make the code work. It becomes easy to think of those things as "magic", and by and large dismiss them. Magic is an important productivity booster, but should be used only after understanding, to some degree, the little cogs that help it arrive.<br /><br />High level languages do work hard for you, and I consider this an ultimate good, because I have a lot of work to do, and want to <a href="http://www.arbingersys.com/2007/12/test1.html">produce results as quickly as possible</a>. One of the mantras of Perl is that given a context, it will simply <a href="http://www.perlfoundation.org/perl_5_10_now_available">Do The Right Thing</a>. Java and C# make it unnecessary to think much (if at all) about pointers. This makes my life a whole lot easier.
But it still helps to understand lower level concepts, for instance when considering the performance of various objects in a language, like <a href="http://groups.google.com/group/microsoft.public.dotnet.framework.performance/browse_thread/thread/200db2dbab439309/302cbd93506eba51">StringBuilder in C#</a>*.<br /><br />I think magic is a danger even beyond CS departments. It's also inherent in productivity tools like Visual Studio, which will probably be learned on the job. If you only learn to use the magic, but don't understand that <span style="font-style: italic;">it isn't really magic</span>, then you're headed for trouble.<br /><br />I have a contractor who works for me developing ASP.NET applications. Out of college he didn't know C#/ASP.NET, but at a previous job had picked it up <span class="me">à la </span>Visual Studio. I was just getting into ASP.NET myself when he hired on, and was a little confused about how the [auto] postback worked. I thought it would be quickest to ask someone with experience, so I did. But he didn't know, even though he was already producing fairly complex web applications for us. In his view, it was just a feature of ASP.NET, and beyond that was not important, as long as he could turn it on or off in the Properties of the various controls.<br /><br />I had reasoned that it must be JavaScript, unless ASP.NET installed some sort of binary control on the sly. Sure enough, <a href="http://www.xefteri.com/articles/show.cfm?id=18">it turned out to be JavaScript</a>. I decided then and there that our contractor was at a disadvantage because he <span style="font-weight: bold;">understood web development primarily through Visual Studio</span>, and this hindered him from realizing that ASP.NET was made to fit a set of (effectively lower level) standards, not the other way around.<br /><br />We do a lot of programming in ASP.NET using Visual Studio, and it <span style="font-style: italic;">is</span> a productivity booster. 
We're building more powerful applications in shorter time frames, and with less effort. Its magic is definitely appreciated. But seeing beneath the magic is what allows us to really understand and fix bugs, and build robust, maintainable applications.<br /><br />Magic is for productivity. It's for those who have gotten the education, and the education is gotten by understanding the little cogs and how they relate to one another.<br /><br /><span style="font-size:78%;">* This discussion, I might add, jumps right into the low level arguments of memory management, showing just how far removed you really are from those little cogs...</span>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-4678983171170252155.post-62349926605660483752007-12-21T08:32:00.001-08:002009-05-21T13:33:18.270-07:00The Case for Flat-Threaded DiscussionsAs I stated in a <a href="http://blog.arbingersys.com/2007/12/mr-spolsky-and-work-is-life-principle.html">previous entry</a>, I've recently built and released an open source "conversation" system called <a href="http://sylbi.arbingersys.com/">Sylbi</a> (currently in beta). This system was based on the idea that <span style="font-weight: bold;">blogs with comments</span> and <span style="font-weight: bold;">forums</span> differ very little, and there was no reason why you couldn't build a system that could be both a forum <span style="font-style: italic;">and</span> a blogging platform.<br /><br />Because Sylbi provides the ability to have discussions, that is, multiple people respond to each other's posts over time, it had to deal with how to display those conversations. The two most common ways for doing this are the <span style="font-style: italic;">flat</span> and <span style="font-style: italic;">threaded</span> models. 
For a detailed and intelligent commentary on the virtues of these methods, see <a href="http://www.codinghorror.com/blog/archives/000733.html">this post</a> from Coding Horror, and <a href="http://www.joelonsoftware.com/articles/BuildingCommunitieswithSo.html">this one</a> from Joel On Software.<br /><br />As I began thinking about this problem, I decided that there is a third method for displaying conversations, one that I feel is preferable to the other two: <span style="font-weight: bold;">threading without indention</span>, or as I like to call it <span style="font-style: italic;">flat-threaded</span>. Here is my conclusion, posted on the official "blog" for the Sylbi project. (You can read <a href="http://sylbi.arbingersys.com/demo/conversation.cgi?rm=blog&uid=5&id=52&pid=213">the full post here</a>, which talks about this as well as the other unique features of Sylbi.)<blockquote>It is my opinion that threading a conversation, that is, grouping replies to a post immediately below that post, provides the most logical organization method. Slashdot discussions are threaded, as are those on reddit. However, I think that indenting replies adds no real value, and instead actually makes the conversation more difficult to read. Sylbi threads conversations, but uses no indentation. So as you scan posts from top to bottom, post replies are clustered together, but you must use the content of the posts to determine the grouping. I refer to this as a "flat-threaded" conversation. Sylbi provides the means to quote previous posts, if this should be necessary.<br /><br />Here's why I think this view works. Books are written from top to bottom. If an author refers to something that occurred in a previous chapter, you rely on your memory and comprehension to understand the reference. If the reference is subtle enough, an author may quote himself. 
Where a conversation is concerned, I think that memory and comprehension don't need to be aided by indentation, and where a reference may require it, you can easily provide a quote.</blockquote>I am committed to "eating my own dogfood", and so am using Sylbi while I work on it. I have a <a href="http://sylbi.arbingersys.com/demo/index.cgi">live version</a> running on my web hosting provider and use it to identify problems with my code as well as my assumptions.<br /><br />One of my initial tests was of the flat-threaded view, and I created <a href="http://sylbi.arbingersys.com/demo/conversation.cgi?topic=Eating%20Our%20Own%20Dogfood&tid=2&id=1&pid=1">this conversation</a> (which I unfortunately made a little difficult to read by using tons of self-references) and began using it to probe the concept. This was a discussion, so as I coded, I tested by adding to it, and eventually, this analogy fell out:<br /><blockquote>Consider a real conversation amongst a group. A topic is started by Alice, and Bob and Charlie discuss it with her for a length of time. Then, Bob touches on an individual point of Alice's initial topic, and a segue is created. Let's say that only Charlie and Bob discuss this point. Alice is silent. But she hasn't said everything she wants about the initial topic, so after they are finished, she brings them back to the topic, and they discuss it further. Bob's segue "held place" for additional comments by Charlie and Bob, and then the original topic was resumed. Viewed in a linear sense, this is exactly what a flat-threaded conversation does.<br /></blockquote>The <span style="font-style: italic;">holds place</span> comment above is in reference to the (at least logical) "fairness" of grouping responses together. A response to a post may well come days after other posts have been made, and those earlier posts are pushed down as the latecomer is inserted below the post it responds to.
For example:<br /><ul><li>Initial post (entry) E <span style="color: rgb(51, 153, 153);">[day 1]</span></li><li>Response (to E) R1 <span style="color: rgb(51, 153, 153);">[day 1]</span></li><li>Response (to R1) R3 <span style="color: rgb(255, 0, 0);">[day 2]</span><br /></li><li>Response (to E) R2 <span style="color: rgb(51, 153, 153);">[day 1]</span><br /></li></ul>A response to a post becomes a <span style="font-weight: bold;">subordinate post</span>, as R3 is subordinate to R1 above. R1 comes before R2, because it was posted earlier. So any responses to R1 get inserted directly below it, ahead of other, potentially earlier posts (R2). So R1 <span style="font-style: italic;">held place</span> for R3, and it <span style="font-weight: bold;">had the right to</span> since it was made earlier. This is a sort of "first come, first served <span style="font-style: italic;">for all my children</span>" mentality. But it serves an important purpose: to keep direct responses together, which provides better cohesion, I think.<br /><br />Of course, there is a caveat. The <span style="font-style: italic;">holds place</span> idea is susceptible to gaming. For instance, if you want to have your entry appear higher up in the list of responses, you could respond to a higher level response, <span style="font-weight: bold;">even if the content of your post is not particularly relevant</span> to that one.<br /><br />Taking our example above, let's say that it's days later, and there are over 100 responses. You want to post, but hate the idea of being all the way at the bottom of the list. So you pick the first response below the initial entry, and respond to it, but really, you just want to sound off on the original entry.
Because you are responding to R1, the system inserts you at the <span style="font-weight: bold;">bottom of the subordinate list for R1</span>, which puts you higher in the list than other posts that followed the rules.<br /><br />This is somewhat mitigated, however, by the fact that in 100 responses with no indention, it is difficult to be entirely clear which post is actually subordinate to which, and therefore where your post is going to appear vertically. It will be much more reasonable to simply respond to a post when you feel that the content of that post requires one.<br /><br />On the other hand, this may simply be a risk involved with human communication, and a small one at that. Further on in the dogfooding conversation above, I observed that real human conversation is far from trouble free.<br /><blockquote>Alice starts a topic with Bob and Charlie. A segue is created, and Alice interjects that they are getting off the subject, and Bob and Charlie return from their tangent. Or they don't, and Alice's conversation is hijacked. I've also seen this conversation pattern (and been involved in it from probably all perspectives).<br /><br />So I think that, basically, when talking about the "natural" flow of conversation and the mantra that trying to mimic this in a forum [is good], it should be noted that real conversation is not necessarily a smooth or clean or non-anarchic interaction. It can be, but it can also be an incredible mess, incredibly trite, or some mix of both.</blockquote>Ultimately, I think the flat-threaded method provides a slightly better view of online conversations by trying to be as <span style="font-style: italic;">contextual</span> as possible, and simplifying the presentation. 
However, just like real conversations, much depends on the humans.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-4678983171170252155.post-78352041432123926142007-12-15T15:20:00.001-08:002009-05-21T13:33:18.046-07:00Dormant Sticky Memory and Layered Comprehension<p style="color: rgb(0, 0, 0);"> I recently finished reading <em>Descartes: The Project of Pure Enquiry</em> by Bernard Williams. As soon as I read the last page, I moved back to chapter 2, and started again from there. This is because I had retained and comprehended only about 50% of the book. Through the years, <span style="font-weight: bold;">as I've learned better how to learn</span>, immediately rereading has become an invaluable device for me, especially with a subject where I lack familiarity or educational background. (Like philosophy.)<br /></p><p style="color: rgb(0, 0, 0);">If you had asked me on page 303 (the last one) to recall or explain anything from chapter 2, I would have been hard pressed to give you an answer. Just now, having finished reading the chapter again, I'd say that I grasped it nearly in full.<br /></p><p style="color: rgb(0, 0, 0);">What I found really interesting, however, was how those things that I wouldn't have been able to recall at the end of the book <span style="font-weight: bold;">jumped out from somewhere in the back of my mind</span> the moment I read them again. For instance, there is an argument about "false lemmas" that uses an analogy about owning a Ford. After rereading the first few sentences, I could recall the full argument in most of its detail.</p><p style="color: rgb(0, 0, 0);">So there must be some aspect of memory that works like a hard drive. (There is: it's called long term memory.) 
It just dumbly writes the "file" there in one of its sectors, where it sits unnoticed until something recalls it and loads it into short term memory (RAM), where you can actively use it.<br /></p><p style="color: rgb(0, 0, 0);">Here's a useful little graphic from Wikipedia (<span style="font-size:85%;">note: this model is criticized for being too simplistic, but it fits pretty well with how memory works <span style="font-style: italic;">upon personal reflection</span>, so it's still a useful visualization, I think</span>):<br /></p><p style="color: rgb(0, 0, 0);"><br /><img src="http://www.arbingersys.com/blog/images/Multistore_model.png" alt="" border="0" /><br /></p><p style="color: rgb(0, 0, 0);">When I first read the book, I had very little stored on the subject of Descartes's <span style="font-style: italic;">Cogito ergo sum</span>. Mr. Williams's book is a thorough analysis of the subject using modern logic, with the benefit of centuries of debate preceding him. In short, it was a <span style="font-weight: bold;">pretty steep curve to dive into</span>. This is why I think that on my first pass I retained and ultimately comprehended so little.</p><p style="color: rgb(0, 0, 0);">On the second pass, it was quite different. I had obviously retained more than I thought, but since it wasn't coupled with strong comprehension, it seems to have been just rather "dumbly" stored. Had I never read the book again, I doubt I would have been able to explain the "false lemmas" argument. Perhaps I would have recalled hearing about it somewhere, but it would have been foggy.<br /></p><p style="color: rgb(0, 0, 0);">But as I <span style="font-style: italic;">re</span>read, my mind already had some notion of the concepts, and so comprehension occurred more rapidly and to a fuller extent than before. You might say that my <span style="font-weight: bold;">comprehension came about in a layered manner</span>.
A hazy concept lay in memory, was fortified by reprocessing the original text, and then stored again (to disk!) as a much more useful item.</p><span style="color: rgb(0, 0, 0);">This makes me think of my early days learning to program, when there were plenty of concepts I was unclear about, and I was rereading all the time. I was playing around with QBASIC on a DOS computer, then tried my hand at Turbo Pascal. Languages ultimately without a future.</span><br /><br /><span style="color: rgb(0, 0, 0);">But I learned the "primitives" of programming from those languages: variables, looping, conditionals, routines, etc. <span>This is a layer of comprehension and sticky memory </span><span style="font-weight: bold;">still employed today</span>. In fact, it's quite clear to me that despite the plethora of languages available, with all their different syntax and conceptual leanings, the actual number of concepts you need to understand <span style="font-weight: bold;">really well</span> is not that large. And once you've obtained and stored those layers, further comprehension occurs much faster.<br /><br />For example, once you understand C pointers and how they work, all reference work in any language, whether Perl, C#, Java, or Python, is easy to understand. The nuance presented by the language is just another, usually small, comprehension layer that must be added.<br /><br />As new programming paradigms appear, I notice that I am able to grasp them much more quickly than I did the primitives from my early stages of instruction, even though those concepts are usually much more abstract and difficult. This is because, I think, <span style="font-weight: bold;">like the second reading of my book</span>, necessary, prior concepts are lying dormant in their sectors, ready to be loaded and rehearsed.
Except it's more like the <span style="font-style: italic;">n</span>th reading, where <span style="font-style: italic;">n</span> is a pretty high number.<br /><br />So if you're new to programming, are overwhelmed by concepts and language choices, or feel like you're learning at much too slow a pace, never fear: if you stick with it and do the work, you will soon notice your comprehension and retention accelerate.<br /></span>Unknownnoreply@blogger.com0