Saturday, 26 July 2008

The Power of ASP.NET

To all you web admins and coders of web servers: please make sure to fail gracefully. What does that mean? Well, you shouldn't present your potential readers with such hilarious demonstrations of your own incompetence:
    HTTP/1.1 404
    Connection: close
    Date: Sat, 26 Jul 2008 13:35:29 GMT
    Server: Microsoft-IIS/6.0
    X-Powered-By: ASP.NET
What's that? A decent 404 message? No! A decent 404 message should include at least an apology and a clear description of the problem (not everyone knows how to make sense of something like HTTP/1.1 404 Connection: close), and probably a sitemap or search box so the user can find the content they came to your web page for in the first place!
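To make that a bit more concrete, here is a minimal sketch in Python of what failing gracefully could look like, using only the standard library's http.server. The page list in KNOWN_PAGES and the names not_found_body, GracefulHandler and serve are made up for illustration; a real site would plug in its own sitemap or search.

```python
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site layout, invented for this example only.
KNOWN_PAGES = {"/": "Home", "/archive": "Archive", "/about": "About"}


def not_found_body(path):
    """Build a 404 page that apologises, explains, and offers a way forward."""
    links = "\n".join(
        f'    <li><a href="{escape(url)}">{escape(title)}</a></li>'
        for url, title in sorted(KNOWN_PAGES.items())
    )
    return (
        "<html><head><title>Page not found</title></head><body>\n"
        "  <h1>Sorry, we could not find that page</h1>\n"
        f"  <p>Nothing lives at <code>{escape(path)}</code>. "
        "The link may be outdated or mistyped.</p>\n"
        "  <p>These pages do exist:</p>\n"
        f"  <ul>\n{links}\n  </ul>\n"
        "</body></html>\n"
    )


class GracefulHandler(BaseHTTPRequestHandler):
    """Serve known pages; answer everything else with a readable 404."""

    def do_GET(self):
        if self.path in KNOWN_PAGES:
            status, body = 200, f"<h1>{escape(KNOWN_PAGES[self.path])}</h1>\n"
        else:
            status, body = 404, not_found_body(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Run the demo server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), GracefulHandler).serve_forever()
```

Calling serve() and then requesting any unknown path should give a page with an apology, the offending path and a list of links instead of bare headers. The requested path is escaped before it is echoed back, so the error page does not turn into an injection hole.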

Friday, 25 July 2008

Safer startx

Many people claim one should be using a display manager for starting one's X session. One oft cited reason to do so is that when you use startx to fire it up, a malicious person could just circumvent any lock you may have put on your X-session by [ctrl]-[alt]-[←]'ing the server and then being dropped to the console.
However, you can just use the following command:

 startx;exit 
This will exit your session once startx exits. Since X traps all signals, switching to the tty and trying to suspend it won't work.
There's one catch though: my default shell (zsh) will warn you about programs you may still have running in the same shell before exiting and thus prevent the exit command from actually doing its job. So you should be careful about this, or just alias startx so that it first turns off the offending shell option.
This is, of course, also handy for safely giving away restricted prompts to other users (su someuser comes to my mind).

If someone knows a reason this is not safe, please tell me :-)

Edit: I initially thought the pun in this post's title would work only in German, but it appears to be OK for English, too. Thanks to ke for pointing this out.

Update:

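I'm such a jerk. Just use the shell's built-in exec (that is, exec startx): it replaces the shell with startx, so there is no shell left to drop back into once X exits. Sheesh.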

Wednesday, 23 July 2008

About the Inertia of Programmers

The Adoption Formula

Over at the Arc Language Blog, there's a nice post trying to give a reason why cool new technologies like Haskell don't catch on nowadays - and why cool old technologies like Lisp didn't catch on some years ago. I'd toss in some Prolog, too - although one reason for Prolog not to be adopted was surely that it was a pain to run on olden days machines.

What they say, basically, is that languages like Lisp or Haskell (or Prolog) are solutions in search of a problem. Not that they're not cool. They are. But who needs higher order functions in programming?
The basic problem is a relation between a certain user's crisis (set of problems, if you so prefer) and the pain of adoption. Let's face it: programmers are a lazy bunch. We'll try not to deviate from the known path any more than we absolutely have to. We remember what a pain it was getting here in the first place. So we have to have a big crisis and some solution that can easily solve this crisis by offering us a way to approach our problems that feels similar to what we already know. Since Haskell, Lisp and Prolog all offer solutions to a set of problems most programmers are not even aware they have (and therefore probably don't really have in the first place), not many people bother picking them up. If they offered solutions to problems programmers know they have, they would see a wider range of adoption.

To make it a bit clearer, take Linux as an example. When Linux saw the light of day, there was an operating system crisis of some sort: Windows dominated the markets and UNIX-based solutions for home PCs weren't cheap or viable (or at all available). BSD was mired in legal uncertainty over its UNIX-derived code, and the HURD was the same vapor it is today. Linux offered UNIX-geeks an easy way out: an operating system that's the same in many aspects, except for those that weren't good about the old one. So that's why Linux got big in geek-world.
But on the desktop market, where grandma and all those pathetic stereotypical example PEBKACs can't tell the difference between a browser and a mail application, Linux is still largely unheard of. Why? Well, Linux solves problems grandma doesn't even know how to spell, never mind realise she has. Free software? DRM? Hardware support? Stability? Security? Who cares?
Spinning 3D cube? Oooooh, shiny!

That's All Folks... is it?

I'd argue that there is one more aspect to it: if technologies like Lisp or Prolog didn't get adopted, why are they not forgotten? Heck, even COBOL is still around and COBOL is anything but cool. There's an easy answer: those languages found a niche. Lisp has always been big in AI development. COBOL, who knows. Prolog has always had a nice and warm room in the basement of universities - and is now catching on in NLP. In those niches, people come from a different background. They have a different notion of pain. For me, managing pointers, malloc() calls and machine code are a pain. I can deal with the first, don't really understand much about the second and have an allergy to the third. But higher order functions? I've had dreams which happened in intensional logic (that was after reading a paper by Groenendijk & Stokhof, I think)! That was a weird feeling waking up, I can tell you.

Also, our crises are different: we need to interface with inferencing tools, and translate logic and mathematical formulae directly into machine instructions. Doing this in C is a pain. I have nothing but admiration for the people behind prover9/mace4, but I'd argue that Haskell is more suitable for that task. A solution that is less of a pain for us is Prolog. You can program almost the same way you write things down in theory (and quickly see where your theory is failing. Damn). You can then go on debugging your theory and back again to punching it in. To a person doing work in IT business, Prolog will look like a complete disaster. What, no unit tests? Well, no. But you have guitracer/0 :-).

So, I don't think that Prolog has to catch on. It's good enough if it stays around. Haskell is more suitable for an even wider range of tasks (basically everything), but probably even more of a pain to get started with. But it seems that recently many people got interested in Haskell. The Programming Reddit is full of Haskell lately. And there is now a haskell-beginners list and it seems that introductory CS courses at Tübingen University are being taught in Scheme. Functional programming has received a lot of attention lately - let's see what comes out of it.

Meanwhile, I don't care. As long as Haskell and Prolog continue to be developed and used by a certain niche population, I'm happy. I'll be using them happily. They solve some crises I have, they are not a big pain to adopt if you come with a strong background in logics and sorted type theory. And every time I cuss at Prolog, I have to remember: would it be any better doing this in Java? Probably not.

Oh, and by the way: if you happen to wonder what good books there are on learning logic, read Gamut's Introduction to Logic and also Volume 2, Intensional Logic and Logic Grammar. Some of the best books I've ever had the pleasure to read. Somewhere next to Camus' Stranger, Dostoevsky's Idiot and Nietzsche's Zarathustra.

Friday, 18 July 2008

Advertent Curt

Wohoo, I handed in my BA thesis topic today:

From Questions to Queries

Enhancing the Curt System by a Theory and Implementation of Embedded WH-Questions

The Curt System is implemented in Prolog and presented in the last chapter of Blackburn & Bos (2005), a brilliant book about Vincent, who loves Mia, every owner of a hash bar and foot massages. Oh, and computational semantics. And boxers.

Curt is short for Clever use of reasoning tools. Blackburn & Bos use external inferencing tools, a DCG grammar and an emulation of λ-Calculus in Prolog to put together a nice system that is able to parse sentences according to a toy grammar, construct (discourse) models and logic representations of the discourse. They use inferencing to check the discourse for satisfiability and validity.

They develop several variants of the Curt System throughout that chapter, gradually introducing external inferencing, satisfiability checks, validity checks, model building, world and situational knowledge, ontologies and, last but not least, a way to interpret questions in their framework. The last version of the system is codenamed "Helpful Curt" - it can answer basic direct WH-questions by replacing the WH-phrase with an existential quantification over an uninstantiated variable and running a bit of inferencing magic on it in order to find a result.
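The idea behind that last step can be sketched in a few lines. This is not how Curt does it (Curt is written in Prolog and hands the hard work to external inferencing tools); it is just a toy in Python that looks up answers in a fixed model, to show what "existentially quantify the WH-phrase and look for a witness" amounts to. The facts in MODEL, the helper names and the convention that variables start with an upper-case letter are all made up for illustration.

```python
# Toy model: each fact is a tuple (predicate, argument, ...).
# The facts below are invented for this example.
MODEL = {
    ("hashbar", "bar1"),
    ("owns", "mia", "bar1"),
    ("owns", "marcellus", "bar1"),
    ("owns", "vincent", "car1"),
    ("loves", "vincent", "mia"),
}


def is_var(term):
    """Variables start with an upper-case letter, as in Prolog."""
    return term[:1].isupper()


def match(atom, fact, binding):
    """Extend binding so that atom equals fact; return None if impossible."""
    if len(atom) != len(fact) or atom[0] != fact[0]:
        return None
    new = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            # Bind the variable, or check it against an earlier binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def solve(atoms, binding=None):
    """Yield every variable binding that makes all atoms true in MODEL."""
    binding = {} if binding is None else binding
    if not atoms:
        yield binding
        return
    first, rest = atoms[0], atoms[1:]
    for fact in MODEL:
        extended = match(first, fact, binding)
        if extended is not None:
            yield from solve(rest, extended)


def who(atoms, var="X"):
    """Answer a WH-question: all witnesses for var, in sorted order."""
    return sorted({b[var] for b in solve(atoms)})


# "Who owns a hash bar?" read as: exists X, Y. owns(X, Y) and hashbar(Y)
print(who([("owns", "X", "Y"), ("hashbar", "Y")]))  # ['marcellus', 'mia']
```

The WH-phrase becomes the variable X, and the answers are simply the values of X for which the existential statement holds in the model. A real system would have to prove things from a discourse representation rather than read them off a table of facts.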

Advertent Curt

But I found that a bit ad-hoc. Basically, all of the system is a tad ad-hoc (they use Keller Storage for it still!) and it could use some work in order to become really awesome (although it's already quite awesome!). I'd like to show in my thesis that it's possible to introduce a formal account of question semantics into the system. I'll be following Groenendijk & Stokhof's analyses for the most part, also arguing about Karttunen's earlier approach. The main reason for me choosing G&S over Karttunen was that the former use Ty2 to describe their system, whereas Karttunen stays in traditional Montague IL, which would be harder to transfer into the Prolog world.

Ideally, the system should be able to handle the following types of questions:

(1) Vincent knows whether Mia loves Marcellus
(2) Marcellus believes Mia loves Vincent
(3) Every owner of a hash bar knows who likes hash

But why embedded questions?

Almost all accounts of formal question semantics (that I know of) treat direct questions as a subtype of embedded questions (well, except Hamblin, but that's obvious[ly not adequate here]). Once we solve embedded questions, implementing a few transformation steps to translate (4a) into (4b) should pose no problem.

(4 a.) Tell me who owns a hash bar.
(4 b.) Who owns a hash bar?

In this regard I'm taking the very opposite of the more pragmatic approach by Blackburn & Bos, which explicitly handles direct questions only.

I will try to keep the blog up-to-date with progress and ideas :-)

Saturday, 12 July 2008

Anachronism isn't what it Used to Be...

I'm not a native English speaker - and sometimes I find myself in a situation where I need to look up a word in the dictionary in order to find more graceful-sounding synonyms, et cetera. But the dictionary is not always very helpful. Consider the following query in a thesaurus:

Synonyms for begin: … inaugurate …
Well - using to inaugurate right away would lead to strange results, because it's not really used for anything besides politics nowadays.

Concordance and Collocation

This means that sometimes you'll have to search not only for the translation or synonyms of a word, but also for usage examples. A good dictionary typically provides some, but is usually limited in the number of examples it can display. This is why looking at occurrences of a word in real text together with their immediate context is sometimes unavoidable (and is in fact the way dictionaries are made). Linguists speak of concordance for grammatical agreement (e.g. using the right tense or aspect or the right preposition) and of collocation for statistical relatedness (two or more words that form a phrase or a certain recurring pattern in language). To find out how a word is used, linguists search for it in corpora and then classify its contexts and the patterns among them.
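If you do have some text at hand, searching for a word together with its context is easy to sketch. The following Python snippet builds a simple keyword-in-context listing and counts which words turn up near the keyword. The function names, the window size and the two sample sentences are my own assumptions for illustration, not the output of any real corpus tool.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def concordance(text, keyword, width=3):
    """Return (left context, keyword, right context) for every occurrence."""
    tokens = tokenize(text)
    keyword = keyword.lower()
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


def collocates(text, keyword, width=3):
    """Count the words that occur within width tokens of the keyword."""
    tokens = tokenize(text)
    keyword = keyword.lower()
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == keyword:
            counts.update(tokens[max(0, i - width):i])
            counts.update(tokens[i + 1:i + 1 + width])
    return counts


# Two made-up sentences standing in for a real corpus.
sample = ("The president will inaugurate the new bridge. "
          "They inaugurate a new era.")

for left, word, right in concordance(sample, "inaugurate", width=2):
    print(f"{left:>15} [{word}] {right}")

print(collocates(sample, "inaugurate", width=2).most_common(1))  # [('new', 2)]
```

On a real corpus the counts only become meaningful with a lot more text, and one would normally filter out function words like "the" before reading anything into the neighbours.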

A Poor Man's Corpus-engine: Google

But such corpora are usually hard to get or expensive, since virtually every accumulation of a non-trivial amount of text will contain copyrighted material (and it's often not easy (read: expensive) to enhance the signal-to-noise ratio). So, as a poor man, one has to resort to Google or equivalent search engines. Searching for a particular word or phrase in Google will often give satisfactory results for its most typical usage patterns. Phrase queries often help to narrow the scope of possible collocations.

… and while doing a bit of research I discovered the following: A word is archaic if the first two pages of Google hits for it return only results from dictionaries. Try searching for advertent (heck, it's not even in vim's default word file on Debian anymore…) as an example...

At Night, Alone with the Computer...

My favorite Prolog dialog

?- true.
true.

A lonesome bungalow in the forest; a man; his computer. A conversation arises.

Welcome.

Welcome

This is probably just a temporary location, until I find some other place on the Net. My blog used to be hosted at Baywords, but they seem to have screwed up something.

So long, have fun.