SQL
SQL (/ˈɛs kjuː ˈɛl/ or /ˈsiːkwəl/) is a programming language for managing data in a database management system (DBMS). In other words, it is alphabet soup for managing data in another alphabet soup. It is a multi-decade attempt to use standards bodies and large corporations to try to disprove the old adage that "too many cooks spoil the soup."
SQL originally stood for Structured Query Language, until the world's finest computer scientists ruled that it wasn't structured, and any queries one might make with it would probably be rejected as mal-formed with a minimum of explanation. As for its status as a language: it is often described as a "declarative language" (4GL) — that is, a language where the programmer, removing his finger from his nose to fidget with his face-piercings, simply states the result he wants to achieve and never has to indicate how it might come about — but it winds up working as poorly as your 4GL cellphone lets you actually make out what your caller is saying.
Informix and Teradata claim that their SQL stands instead for Stored Query Language. This acknowledges that there is nothing terribly "structured" about SQL, although things you create with it, you might be able to "store" — and even find at a later date, provided you don't use SQL.
In 1986, SQL became a standard of the American National Standards Institute (ANSI). The next year, it went on a global tour, fronting for an unknown act called HTML at the International Organization for Standardization (ISO). Each of these standards purports to define SQL exactly, and it remains astonishing that at no time in the following three decades could one write anything in SQL fancier than "Hello, World!" that would run on all SQL servers.
History[edit]
The initial impetus for SQL might have been a design set out in an influential 1970 Codd piece entitled, A Relational Model of Large Shared Data Banks. Or it may have been Heinlein's 1966 paper, The Database Is a Harsh Mistress.
No matter who gave them the idea, SQL was initially developed by Chamberlain and Boyce in the early 1970s, a period for which few records exist. It was designed to play pranks on data stored under System R, another internal project that IBM could only hope also remained swept under the rug.
These initial efforts ignored the fact that neither payroll records nor song collections were apt for treatment as a table, and if they were forced into such a form, no one would want to do things like insert, delete, or specify custom views.
SQL was originally named SEQUEL, but was changed to SQL because SEQUEL was a trademark for an airplane, and mimicking the name would lead people to expect the project to take flight. Happily, the name-change saved 3 bytes every time the language was referred to. This is the main reason for its wide use.
Another key reason for the adoption of SQL is that COBOL was falling out of favor. Programmers were used to the verbosity of COBOL, a language they could show their bosses in the business suits and make them think they understood though being unable to do the job themselves. SQL fit this bill because everything about it looks vaguely like English, although nothing about it is understandable. For example: Debt IS NOT DISTINCT FROM - Receivables
. We all know those words, and we all can therefore "read SQL" — yet no one has a clue what this piece of it is trying to do.
Theoretical model[edit]
Codd's seminal paper posited a relational model (pictured) and a tuple calculus. The reader will remember learning these concepts in grade school — and practicing them on the playground while his more successful classmates were in the bushes playing doctor.
Shockingly, however, SQL deviates in several ways from this theoretical foundation. Rather than "tuples," for example, SQL operates on tables with rows, which the SQL programmer can actually grasp. And theoretical SQL was useful, whereas commercial implementations are an overt waste of time.
Nevertheless, the SQL programmer grew up believing that Jesus himself wanted you to spill your guts to the priest in a small booth, and that Star Trek could be "rebooted" into a movie that got nothing right provided it had Khan and a few tribbles. Therefore, programmers easily cope with having only modern SQL to use, rather than The Real Thing.
Critics argue that SQL should be replaced with a language based on the original foundation. Some of these critics point out that, with Obama-care and a new, super-high minimum wage, it will soon be infeasible to employ human beings at all. At that point, all payroll records might be replaced with "tuples," on which "calculus" might be possible. Some of these critics are consulting companies, perfectly situated to re-code all of the Personnel Department's SQL software. While the world waits for such a replacement language to be devised, the consultants will loiter around the office, drinking n-tuple helpings of coffee from the machine, and prepare additional suggestions by which the business can be "optimized."
Syntax[edit]
SQL comprises clauses, expressions, predicates, and "insignificant whitespace," the goal, again, being to produce reams of code that is "readable," while not understandable. A typical SQL statement selects information out of a table by specifying the table's name. The statement defines what information to extract, what changes to make to it, then obliterates it completely using clauses such as SLICE BY
and then DICE BY
. Tabular data can then be puréed and fricaseed.
In Dragnet, Officer Joe Friday began each broadcast by stating, "The story you are about to see is true. The names have been retrieved with a SQL query to protect the innocent."
SQL queries may combine data from multiple tables. There never being one best way to do anything, SQL lets you specify how the tables should be combined: one after the other, shuffling as you would shuffle a deck of cards, shuffling with half the data "on the floor" as you would shuffle after midnight when drunk, joined but putting really useless rows up for auction on eBay, and joined with the removal of the One Percent who aren't paying enough taxes.
A briefly reassuring aspect of SQL is that these queries are not persistent. Making a query does not permanently change any table, much as you might want to. (Proprietary SQL products therefore have features that break this rule.) Non-persistence means that, if you were to fire a revolver into a SQL server, any changes would go away, and with luck, before the police arrived to ensure that you did too.
Obtaining and massaging data from tables, like reading Wikipedia, may be educational, but it is nothing one's boss will pay one to do for eight hours a day. The programmer can create additional tables based on queries, but that is merely an invitation to do more of the same. Therefore, SQL contains many additional features. Data extracted from a table can be printed on the Line Printer, or can be emailed to a large list of unwilling recipients. In addition, any of the console sense lights can be turned on or off based on the results of a query, and that always feels like an accomplishment.
Extensions[edit]
Despite all these bells and whistles, SQL remains patently useless. For this reason, SQL implementors, the Henry III of their day, have stretched SQL in sundry directions. Unfortunately, using any of these extensions renders a SQL application incapable of running anywhere else.
Subqueries[edit]
Queries can be nested so that the results of one query are available for use in another. On all but the most anal-retentive implementations, not only subqueries but sub-subqueries and sub-sub-subqueries can be defined.
The perky *
syntax can be used to make a query depend on its own results. This makes it easy to code an infinite loop and bog down all the computing resources in the enterprise.
Distributed transactions[edit]
The concept of transactions lets the programmer group changes to the database, so that all of them take effect or (if the programmer is a woman who every day proclaims her "right to change her mind") the transaction can simply be called off and none of the changes take effect. We see an example of a transaction here:
START TRANSACTION;
UPDATE Account SET amount=amount-200 WHERE account_number=1234;
UPDATE Account SET amount=amount+200 WHERE account_number=2345;
IF ERRORS=0 COMMIT;
IF ERRORS<>0 ROLLBACK;
Obviously, one doesn't want to take $200 out of account #1234 unless one can successfully also put it into account #2345. Unless one is Richard Pryor in Superman III and intends to put it in his pocket instead. (Further to Mr. Pryor, any SQL operation can be declared NATURAL
. Seven-come-eleven, baby needs noo shooz, etc.)
The modern corporate employee, who is required to wear a straitjacket without pockets and is led to his cubicle by Brinks guards, cannot steal in such obvious ways. However, in distributed transactions, all of the resources needed to complete a transaction have a lock placed on them until the SQL program decides whether to go through with it or back out of it. The clever programmer can put a lock on Seats 13 and 14 in the middle of Row A at a Justin Bieber concert, ensuring that the lock remains in effect despite never committing the transaction, the seats are unsold through the start of the concert, and he and his date can sit there with a piddling Standing Room ticket.
Three-valued logic[edit]
The utter inability of SQL servers to produce valid answers has led to the development of three-valued logic, in which the answer to any query is either yes, no, or "I haven't got a clue" (in the American locale: "shut the fuck up").
Combining STFU
(or, in Britain, NIL
) with the AND
and/or OR
operators, of course, produces a result of STFU
.
SQL implementors have held entire week-long conferences discussing angels-on-the-head-of-a-pin issues such as whether STFU=STFU
should return yes (because anything is equal to itself) or another, perhaps louder STFU
(as nobody knows whether the first STFU
bears any relation to the second). Few of the idiots savants at these conferences go on to publish papers, but several edit Wikipedia to point out how internally inconsistent SQL itself is. It is, like, a hypocrite.
Break-out groups at these conferences debate whether a single use of the technology is pronounced "a see-quel transaction" or "an ess-cue-ell transaction." There is no open bar.
Controversy[edit]
Standard SQL[edit]
Regardless of whether you obey ANSI SQL-86 or ISO/IEC 9075, no SQL code finally gotten to run on your SQL server will ever run on any other server. To produce "portable SQL," the programmer needs to avoid advanced features and confine the program to basic SQL. For example:
- Do not reference any data types having to do with the current date, time, or time zone. A perfectly good SQL transaction, especially for the banking industry, can assume all distributed transactions occurred in Greenwich, England as of the start of the Civil War.
- You may pull text strings out of a database, but do not try to perform advanced and ambiguous operations on them, such as putting two of them together or searching one for a sub-string.
- As case sensitivity differs among SQL vendors, protect yourself by using neither uppercase nor lowercase letters in your SQL program. All variable names should use nothing but numerals. For the SQL keywords themselves, capitalize everything, because it looks as though you are shouting, and the boss will be impressed at the "take-charge attitude."
Reason for the incompatibilities[edit]
Oddly, there are no comparable problems writing a web page in HTML, and most browsers produce something sensible. There are several reasons why SQL has so many problems:
- Microsoft hosted most of the standards meetings, providing everything from fritters to strippers, so that the representatives of other vendors would approve of features (to "make life easier" for the 97% of the market that was already running Microsoft) that would not run fast unless you either owned or rented a Microsoft technology patent. That is, standard SQL requires everyone either to be Microsoft or to pay Microsoft for not being Microsoft.
- The standards specify syntax more clearly than semantics. That is, they are as demanding on the coder as they are vague for the implementor. For some aspects, they proudly leave implementation completely undefined. For example, on some SQL servers,
ROLLBACK
is achieved by reversing changes and removing locks; on others, it is achieved by detonating small nuclear weapons at the base of the server stack. - And who knew that table names on some servers would be limited to eight USASCII characters?
System views[edit]
Vendors invented system views in view of this system of problems. They decided that, to gain access to one-off features, no one should use one-off SQL statements. It would be better to use rock-solid SQL to open one-off tables. Indeed it was better — in terms of thinking one is an elegant coder, but not in terms of emailing one's code to someone else.
Some of these one-off system views are indexes. So you can open a table of all the names of all the tables. These SQL systems are an entire "page in the style of the thing it's about." Opening an index is the surest and swiftest guarantee that your program will break in any other county, and probably in any future year.
Customers of SQL databases are unwilling to convert from "standard" SQL to a SQL where the code is actually portable. This would require recoding hundreds of thousands of pages of SQL, some of it existing only on punched cards. The entire industrial world attempted a comparable panic drill called Y2K; although it was declared a success, no one slept in his own bed for months (though that is a different story).
SQL vendors don't want to convert either. Management totes up all those monthly checks on support contracts and then imagines a nightmare world where their customers could simply pick up and move to a competitor's system. Before you know it, they are offering to host the next "standards meeting."