Real World Computing
The missing LINQ
End Function
Now, I'm not saying this is the only way of doing this job, nor is it perfect: a file could be changed or disappear between you running the query and examining the contents of the dataset. However, it does work on a real-world website, and it's simple enough to understand. If you need to watch a folder for file changes then, rather than polling with this query more frequently, you'd be better off using the FileSystemWatcher component in .NET, which listens for the notifications that Windows raises when files in a folder are created, changed, renamed or deleted.
You may also feel that this is an over-complicated way to display a list of filenames, which it may well be, but the simple query that gets the filenames contained in a folder could easily be adapted to compare two folders and return just the differences, or to compare the actual files themselves - it's open to all sorts of further possibilities. Also, bear in mind that this code may not work by the time you read this article, if Microsoft has developed the technology further.
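To make the folder-comparison idea concrete, here's a minimal sketch written in Python rather than the .NET LINQ code this column uses. It illustrates the technique only; the function name folder_differences and the dictionary keys are assumptions made for this example.

```python
from pathlib import Path


def folder_differences(left, right):
    """Return the filenames that appear in only one of two folders.

    Hypothetical helper for illustration; it compares names only,
    not file contents.
    """
    # is_file() returns False for an entry that has vanished since the
    # listing was taken, so a file deleted mid-scan is simply skipped.
    left_names = {p.name for p in Path(left).iterdir() if p.is_file()}
    right_names = {p.name for p in Path(right).iterdir() if p.is_file()}
    return {
        "only_in_left": sorted(left_names - right_names),
        "only_in_right": sorted(right_names - left_names),
    }
```

Calling folder_differences on two folder paths gives two sorted lists of names. Comparing the files themselves would mean also comparing sizes, timestamps or checksums for the names the folders share.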
So there you have it: a real-world use for LINQ, something you probably weren't expecting to read about here. What a shame we can't perform this sort of query via SQL. Or can we?
While researching this topic, I found the RitmarkFS project (www.ritmark.com/ritmark/index.php), which does indeed allow you to query the filesystem via MySQL. The concept is that the whole filesystem is made to look like a relational table, with each file being a set of properties and each property having a name and a value, so that each file is represented by a series of rows in the table. To extract the extra data specific to a file type, you'll need a driver for that type; for example, MP3 files will need a different driver from PDFs. It isn't quite a fit-and-forget solution yet, but it does show what could be done. After installing the Ritmark code, you can query your filesystem using a simple SQL command such as:
SELECT a.filename, b.filename FROM test.c$$ritmarktest a, test.c$$ritmarktest b
RitmarkFS will only work with MySQL, which isn't such a great limitation, since this is a very popular database (and free, too, which never hurts a product's popularity). However, it seems that no work has been done on the project since November 2006. I wonder why?
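The "each file is a series of property rows" concept can be sketched with an ordinary database table. The example below is not RitmarkFS's actual schema and it uses Python's built-in SQLite module rather than MySQL. The table file_props, its columns, the sample files and the helper files_with are all invented for illustration.

```python
import sqlite3

# Hypothetical layout: one row per (file, property name, property value).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_props (file_id INTEGER, name TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO file_props VALUES (?, ?, ?)",
    [
        (1, "filename", "song.mp3"),
        (1, "size", "4194304"),
        (1, "artist", "Example Artist"),  # would come from an MP3-specific driver
        (2, "filename", "report.pdf"),
        (2, "size", "102400"),
    ],
)


def files_with(conn, prop, value):
    """Return the filenames of files that have the given property value."""
    query = """
        SELECT fn.value
        FROM file_props AS p
        JOIN file_props AS fn
          ON fn.file_id = p.file_id AND fn.name = 'filename'
        WHERE p.name = ? AND p.value = ?
        ORDER BY fn.value
    """
    return [row[0] for row in conn.execute(query, (prop, value))]
```

The self-join is the price of this layout: finding a file by one property and reporting another means matching the table against itself on the file identifier.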
Secure thoughts
Security is never far from the headlines these days, whether it's the latest chunk of data lost by a government department or a story about a group of hackers attempting to gain access to bank accounts. All such stories are of concern to anyone who develops an application that's either used by others or runs on a machine connected to the internet, which covers all your web applications. While not directly a security risk in themselves, machine-generated entries to web forms can be used to find machines that are vulnerable to certain exploits, to say nothing of the annoyance of removing these postings from the web server's database. For some time, there's been a technology that gives us a reasonable way of preventing such inputs: a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) uses a dynamically generated code word whose image is made from distorted characters - the user has to recognise and retype the word to confirm they're a real person, since no current OCR software can read such heavily distorted images.
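The server side of that exchange can be sketched in a few lines of Python. This is only an outline under stated assumptions: drawing the distorted image is left out entirely, the in-memory dictionary stands in for whatever session storage a real site would use, and the names new_challenge and verify are invented for the example.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(store, length=6):
    """Generate a random code word and remember it under a one-time token."""
    token = secrets.token_hex(16)
    word = "".join(secrets.choice(ALPHABET) for _ in range(length))
    store[token] = word
    # A real site would render `word` as a distorted image and send only
    # the image and the token to the browser, never the word itself.
    return token, word


def verify(store, token, answer):
    """Check the user's answer; each token can be used only once."""
    expected = store.pop(token, None)
    if expected is None:
        return False
    return secrets.compare_digest(expected.encode(), answer.strip().upper().encode())
```

Removing the token on the first attempt matters: otherwise a script could keep guessing against the same image, or replay a word that a human had already solved.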
I say "current" because there's a game going round that asks users to enter CAPTCHA codes, with an image of a woman appearing in fewer clothes as each code is entered. The purpose of this apparently harmless "game" is to collect a database of CAPTCHA images and their plain-text equivalents, so that future machines will be able to read them. Using the vast number of users on the internet to do work that's currently impossible for a computer to achieve is a growing idea - encourage real people to enter information so that the computers can "learn" from it. If at first this sounds a little too much like science fiction, take a look at our old friend Amazon and one of its current beta projects.
