Talk:SQL/Archive 2

Latest comment: 17 years ago by SqlPac in topic "Cartesian Join"
Archive 1Archive 2Archive 3


Who was first and whose great idea was it?

The dates for the System/38 were relatively easy to find. All I can find for Oracle is summer 1979, including on the Oracle web site. Seems awfully fishy to me that a company that continues to use this as a selling point today, doesn't give a specific date. Doesn't really matter who was first, but if this page is going to support Oracle's claim of being the first RDBMS, shouldn't there be a date to back it up. Also shouldn't this page hightlight the fact that Oracle didn't invent the idea, but created their version off concepts being developed by IBM. Instead we get a less than correct statement about how Oracle beat IBM by two years. Also how about some statement to the fact were mainframe and Unix worlds eventually were brought on board, but it was the midrange groups (System/38-SystemR(DB2) and VAX(Oracle) that moved forward with using RDBMS systems.

Oracle claims to be the first *commercial* RDBMS. I don't think they ever claimed to be the first implementation. IBM had been doing experiments for some time before Oracle released their first RDBMS I believe.

Given link (http://www.dbdebunk.com/ Tutorial-D - A language by C. J. Date and colleagues) don't point to information about Tutorial-D. Please add more precise url or delete it.

Agreed - I remove the link Turnstep 18:56, 4 October 2005 (UTC)

Introduction

"The language has evolved beyond its original purpose to support object-relational database management systems." Is this correct? I read this as though the original purpose of SQL was to support object-relational DBMS's. I thought it was for relationals DBMS's and OODBMS's are a recent addition.

Am I mis-reading it or am I incorrect in my SQL history?

Thanks --Doc0tis 22:03, 14 December 2005 (UTC)

Don't know. See this reference on it: ChiMu Corp. - Foundations of Object-Relational Mapping DBBell 19:51, 15 December 2005 (UTC)
You are right, it should be "...support relational database management systems." I'll change it. Pukkie 07:27, 6 September 2006 (UTC)
Perhaps "The language has evolved beyond its original purpose of supporting relational database management systems to support object-relational database management systems" is closer to the original intent, and contains both intended original pieces of information, if it is rather unwieldy. Perhaps "The language has evolved beyond its original purpose of supporting relational database management systems and now supports object-relational database management systems as well"? J A Treloar 83.146.21.253 18:18, 1 February 2007 (UTC)
I've bitten the bullet and changed it J A Treloar 83.146.21.253 18:27, 1 February 2007 (UTC)

Over-reliance on "NULLs"

The article says that one flaw of SQL is "Over-reliance on "NULLs", which some consider a flawed or over-used concept."

I can't seem to find anything in the references here to substantiate that assertion. I've met DBAs and develoeprs who use NULL inappropriately. But the author says that SQL itself is overly reliant on NULL. Can we have an example or a citation that explains how the language itself is reliant on NULL? -- Mikeblas 09:09, 30 December 2005 (UTC)

-- This is a really good question. I'd like to hear an answer myself. Scotto 09:05, 30 December 2005 (UTC)

-- Sorry, that was me. I didn't sign my comment. If we don't hear anything, I think that line should be stricken as POV and/or unsubstantiated. -- Mikeblas 09:09, 30 December 2005 (UTC)

-- I concur --Doc0tis 19:16, 5 January 2006 (UTC)

I disagree that NULLs are bad, but here's a cite: [1].--SarekOfVulcan 19:47, 5 January 2006 (UTC)

-- I was just reading the SQL page, and I would have liked to have seen more explanation about this. However I use SQL and know about NULL values, I don't know what is meant by Over-reliance on "NULLs". --Bernard François 07:47, 7 January 2006 (UTC)

  • I strongly contest the statement that NULLs are flawed or overused, in and of themselves. Is it a mistake that C and Java make references nullable by default? Yes. Is it a mistake that mainstream compilers don't statically typecheck nullable types in a safe way? Yes. There are situations in SQL and other languages where NULLs are misused or abused, but as many situations where they're useful for describing a piece of information that's, for example, present for most but not all of the records in a table. The quoted text is a sweeping generalization that should be qualified or omitted. Deco 22:04, 11 February 2006 (UTC)
    • C doesn't have references. C++ doesn't allow comparing pointers to two unlreated types, so I can't understand what you mean by compilers "don't statically typecheck nullable types in a safe way". Despite this, I agree with you -- nobody else has defended (or even explained) the assertion in the article, so I've removed it. -- Mikeblas 04:54, 26 February 2006 (UTC)
      • I assume what Deco meant by "statically typecheck nullable types" is that C and C++, unlike SQL, don't let you say "this pointer is allowed to be null" or "this pointer is not allowed to be null". I agree that the SQL notion of NULL is theoretically and quite often practically useful, because "this element's value is the empty string" and "this element has no value" are two distinct, different concepts. (Nonetheless, like any sophisticated and mildly-subtle distinction, the "NULL" versus "empty" distinction has great potential to trip up people who aren't attuned to it, leading predictably to (a) bogus code and then (b) claims that the distinction is "bad".) —Steve Summit (talk) 03:29, 13 March 2006 (UTC)

The language doesn't rely on NULL - it's just a useful abstraction. Countersubject 21:49, 16 November 2006 (UTC)

Numbering for DDL, DML and DCL

Since there are three parts withing the Data Sublanguage, they should be numbered (numbers 1,2,3 at the beginning of the title). I don't know the syntax to do this in a proper way (if there is a proper way). Can someone please do this? I think it would make the structure of the article more clear. --Bernard François 07:59, 7 January 2006 (UTC)

Is Null a cell value or a cell flag?

I have heard everybody saying that null is a value but I don't think so. Don't you think it is a flag of a particular cell rather than value of that cell. It is that flag which determine whether there is residing some particular value in the cell or not. Why we must compare a null(a flag) with a value e.g. some people say that null=any value, null>any value and null<any value yield no result. Why we would compare a cell flag with cell value. SQL has to maintain the null flag as it is the necessity of the database system. Some people say that SQL is like three valued - true, false and null but I don't agree with that fact becuase true and false is totally different from null flag. What do you say about that. I am confused about the fact whether null is a value or a flag. If somebody has thorogh understanding of the Null then discuss here please. Thanks.

See Null (SQL) for a discussion of SQL Nulls.SqlPac 03:35, 17 May 2007 (UTC)
Partly it comes down to definitions. I'm not a SQL expert, but I do tend to think of NULL more like a flag; it indicates that there is no value. But you do have to define how NULL participates in comparisons, because if I say "SELECT FROM foo WHERE bar > 0", we have to know what happens if there's a row where bar is NULL.
But I suspect a lot of people think of NULL as a "value", even though what it means is that there is no value.
Another way to think of it is that NULL is a lot like the NaN which appears in many floating-point representations. NaN is a special value that means "Not a Number". NaNs have several special and peculiar properties, such as that a NaN does not compare equal to a NaN. So is NaN a "value" or a "flag"? It certainly acts like a flag, and what it means is, by definition, that it is not a number. But you can also say that it's a value, albeit a special one. —Steve Summit (talk) 03:38, 13 March 2006 (UTC)
If you think in terms of polymorphism, a flag and a value get mapped onto the same concept: values which behave differently according to their dynamic definition. Cfr the Null Object Pattern Wouter Lievens 09:41, 13 March 2006 (UTC)
Null is not a value: it is not an element of any data domain. Most of the problems with Null (three-valued logic, etc.) stem from that fact. GregorB 14:32, 16 November 2006 (UTC)
It doesn't matter. Null is the absence of a value, or "unknown." Whether the database engine implements this as a special value ( which SQL Server seems to - it does a constant lookup when you use Is Null or Is Not Null in your query ) or as a flag decorating the value. What matters is that it's logically consistant, not how it's physically implemented inside the black box. DigitalEnthusiast 19:29, 22 December 2006 (UTC)

SQL is not a relational database

SQL is not a relational database according to the relational database model.

-- Leandro GFC Dutra 13:21, 18 April 2006 (UTC)

You're right, SQL is not a relational database. It's a language for using relational databases. Goplat 04:09, 15 April 2006 (UTC)
Double wrong. SQL deals with merely SQL databases. The ISO SQL standards do not even use the relation word anymore, because it does not apply in any meaningful, precise sense.
--
Leandro GFC Dutra 13:21, 18 April 2006 (UTC)
SQL may not be relational if you're using a strict definition, but most people consider it to be close enough to be called relational, and the differences have practical reasons behind them. Goplat 18:01, 18 April 2006 (UTC)

SQuirreL

The page currently claims that "SQuirreL" is a popular alias/pronounciation for SQL. I have never heard it before, but then again, I live in a country where English is not the main language. How can the claim be verified? If it can't, shouldn't it be removed? TroelsArvin 13:57, 21 May 2006 (UTC)


I live in Canada (Toronto Area) and have never heard SQuirreL either...--65.93.236.218 17:31, 14 July 2006 (UTC)

never --207.54.103.130 18:05, 23 August 2006 (UTC)

I think it's a secret Countersubject 21:40, 16 November 2006 (UTC)

One of my (ab)users terms it SQuirreL. J A Treloar 83.146.21.253 13:13, 6 February 2007 (UTC)

Sounds like a marketing campaign by Next Generation Software, makers of a software package called "NGSSQuirreL". "SQuirreL" is not a common pronunciation for SQL in English (unless you have a really, really, really heavy U.S. Southern accent, possibly). SqlPac 03:25, 17 May 2007 (UTC)

Added pronunciation notes to lede

I noted an oddity about "a/an SQL" in the linguistics article clitics, then decided that it was noteworthy here as well. MaxEnt 04:37, 12 June 2006 (UTC)

Misplaced Voltaire reference, but to what extent?

I thought about revising the paragraph with the Voltaire reference to read:

One joke about SQL is that "SQL is not structured, nor is it limited to queries, nor is it a language," (which recalls Voltaire's remark that the Holy Roman Empire was "neither holy, nor Roman, nor an empire.")

(which is how Gibbon quotes it, IIRC), but it really seems like a kind of pointless reference. Those who recognize it will recognize it without wikipedia's help and those who don't probably don't care to be distracted by links to Voltaire and the Holy Roman Empire while learning about SQL. So it occurred to me to remove it entirely but I'm not sure about that. Anybody else have any ideas on this or feel strongly enough to go ahead one way or the other?

Anyway - either way, its appearance as a displaced afterthought tacked onto the end of the paragraph isn't good.

It would be awful if someone who was trying to learn something about databases accidentally learned something about Voltaire or the HRE, wouldn't it. Next thing you know, people will become well-rounded. Kaleja 05:27, 29 September 2006 (UTC)

"Cartesian Join"

The article makes reference to something called a "Cartesian Join," which isn't properly linked or explained. It's a very jargon-ish sounding term, and if it's not explained within the article, then it should be linked to someplace where it is explained in full. I can't even find another article that uses the term, though. The closest thing I can find is a mention of a Cross Join on the main Join (SQL) page, which apparently is a join that returns the Cartesian product of the two tables (as opposed to an inner join or outer join), but it's not clear if that's the same as a "Cartesian Join." I think that's what's meant by 'Cartesian Join,' but I'm not sure enough that I want to link it. This is confusing and should be cleared up by somebody familiar with the topic. If a "Cartesian Join" is the same as a Cross Join, then it should be linked that way (or the 'cartesian join' term should be replaced with 'cross join,' and the latter linked to an explanation). Anyone want to fix this? --Kadin2048 17:32, 2 August 2006 (UTC)

The term "cartesian product" is fairly common. I've never heard "cartesian join" from a reliable source. "Cross join" is an SQL feature which performs a cartesian product. --Craig Stuntz 18:43, 2 August 2006 (UTC)
A cartesian join is the same as cartesian product. That's what they called it in my DB class at college. --Doc0tis 18:10, 23 August 2006 (UTC)
Oracle's Explain Plan on a query resulting in a cartesian product describes its operation as a 'Merge Join Cartesian', so that's almost a "Cartesian Join". -anon
A cartessian join is an exploding join. These articles shouldn't be written by people unfamiliar with the sources. Edited, yes, because the article should be understandable by all ... but this would be a little like me telling a contractor I don't believe that bricks exist. Select * From Table_A Join Table_B is an inner join, even though the query doesn't explicitly say so. The same is true of cartessian joins. They're usually implicit, and most often by accident. Other than creating test data, there's very little real need to pair every combination of values from two ( or more ) lists. DigitalEnthusiast 23:54, 21 December 2006 (UTC)
I believe that a Cartesian Join is shorthand for "a join that creates a Cartesian Product". I find the two separate links to Cartesian Product and to Join (SQL) irritating. I would find a new article on "Cartesian Join" or a parenthetic note, more to my taste. However, if no-one agrees with my belief that Cartesian Join is a shorthand, and it is impossible to verify then my irritation must remain. J A Treloar 83.146.21.253 18:13, 1 February 2007 (UTC)
No such thing as a "cartesian join" in SQL. A SQL "Cross Join" produces a Cartesian product of two tables. An "Inner Join" first produces a Cartesian product of two tables and then eliminates rows that do not match the join criteria. Etc. So while Cartesian product is used by SQL (and in fact, specified by the ISO SQL Standard), there is no such thing as a "Cartesian join". I don't even know if "Cartesian join" is still in the article, but if so it needs to go bye-bye. SqlPac 18:59, 8 June 2007 (UTC)

Designer credit in the infobox?

Why does Codd get designer credit? He didn't author the language, just the concept of an RDBMS. If I write a new access language for an RDBMS, is he also the Designer?

Also, how come IBM gets credit, while the actual authors of the paper do not? —The preceding unsigned comment was added by Vonfraginoff (talkcontribs) .

I agree on both points. --Craig Stuntz 15:58, 20 September 2006 (UTC)

comparison with fourth-generation programming languages

The article says that questions like 'Who had the top ten scores?' are more difficult to program in SQL than languages like SAS.

With the introduction in ISO SQL 2003 of window functions ROW_NUMBER() and RANK(), I believe this question becomes a lot easier in SQL. Would it be worth mentioning this? Lincoln Hannah 04:43, 19 October 2006 (UTC)

Yes, but that doesn't help if you're using SQL Server 2000. Version 2005 ( aka 9 ) allows this, and Oracle has provided RowNumber since forever.
The question is flawed, though. It could be easily answered in SQL: "SELECT TOP 10 * FROM HIGH_SCORES;". What would be very difficult in SQL without ranking functions ( or pre-crunched data ) but very easy in a procedural language like VB or C# would be "Who had the 10th highest score?" SQL works in result sets and without resorting to complicated tricks like nested subqueries or cursors, this would be easier done by hand than in SQL. DigitalEnthusiast 23:58, 21 December 2006 (UTC)
select * from high_scores order by score desc limit 1 offset 9 - that doesn't seem overly complex. MichaelSpeer 19:49, 31 January 2007 (UTC)

The language syntax is rather complex

"The language syntax is rather complex (sometimes called "COBOL-like")" Do people actaully think this? I have never heard anyone say this. It seems simple enough to meDoc0tis 17:58, 2 November 2006 (UTC)

It depends on where people are coming from. Non-programmers aside, those with a background in procedural languages can initially find declarative languages difficult - it's the paradigm-shift thing. However, I wouldn't expect a BNF grammar for SQL to be intrinsically more complex than one for, say, Java or C#. Countersubject 23:50, 2 November 2006 (UTC)
Perhaps we could say something like, "The grammar is complex (per BNF grammar specification) and in many ways resembles COBOL's key-word based syntax. Some argue that this makes it more English-like and thus "natural" such that it is easier to learn and read; but may also make interpretors or compilers more difficult to write, less conducive to automatic SQL generation, and more difficult to be DBA-extendable." {Anonymous Donor}
I'd strike the sentence. I've never heard anyone who's become comfortable with the paradigm call the syntax complex (or COBOL's, for that matter). Purely on syntax I'd say it's less complex than C/C++ (the pointers) and Perl, though more complex than Fortran or Prolog.--Rudd73 22:13, 13 November 2006 (UTC)
Done Countersubject 08:46, 14 November 2006 (UTC)
SQL has what, 25 keywords? I can point out two or three people who find tieing their shoes rather complex, but that doesn't make it encyclopedic! DigitalEnthusiast 00:01, 22 December 2006 (UTC)

Whitemarsh copy of SQL:2003

I dont think it is appropriate for Wikipedia to link to the late draft available at http://www.wiscorp.com/, as the copyright status of this is not stated, and as a result wikipedia may currently be subverting copyright:


John Vandenberg 07:08, 15 November 2006 (UTC)

Are you suggesting that a link to a copyrighted resource 'subverts' copyright? In that case google, wikipedia and a plethora of other sites have some serious law suits coming their way.--Grimboy 18:13, 27 November 2006 (UTC)

Removing [allegedly] untrue information

From the criticism section, I've removed the following bullet points, because either they're not true, or they're opinion, which really doesn't belong in an encyclopedia. Saying "update vs insert syntax is confusing" is like saying "Megan is really hot." The syntax is well documented, well known to many, and like riding a bike. The other three, however, are simply untrue for the vast majority of SQL databases, and the ones a person is most likely to find themselves using. I've added reasons why this is untrue in parens after each statement I've removed.

  • It does not provide a standard way, or at least a commonly-supported way, to split large commands into multiple smaller ones that reference each other by name. This tends to result in "run-on SQL sentences" and may force one into a deep hierarchical nesting when a graph-like (reference-by-name) approach may be more appropriate and better repetition-factoring. ( This is the reason views, stored procedures, and functions were created - SQL is logic and data centric, not pysical implementation centric. ) Yes, but one needs special previlages to do this, and often such a needs is a one-query use rather than something that needs to muck up the permanent SP list.
  • For larger statements, it is often difficult to factor repeated patterns and expressions into one or fewer places to avoid repetition and avoid having to make the same change to different places in a given statement. ( This is what "User Defined Functions" aka "UDFs" were created to address. )
  • The difference between value-to-column assignment in UPDATE and INSERT syntax is puzzling and confusing to many.
  • The language cannot easily be extended by programmers or DBAs. Although some variations allow the addition of functions, the functions can only take scalar values and not tables (real or virtual) as arguments. The archaic syntax (above) may be part of the reason for this. ( This is simply untrue, at least in 2006. SQL Server, Oracle, and MS Access, the three RDBMS solutions I've used, all provide UDFs ( via VBA with MS Access ) and Stored Procedures ( although Access implements them as macros that fire off a series of independant queries ). ) —The preceding unsigned comment was added by DigitalEnthusiast (talkcontribs) 00:20, 5 December 2006 (UTC).

Allegedly? Is the sky allegedly blue? Does gravity allegedly pull people downward and prevent us from flying off into space? Although knowledge of SQL is a bit more specialized than knowledge of gravity, both Newton's theory and what I wrote are true. ( Although Newton's math has been updated some. ) Anybody who has worked much with structured query language will recognize that these are more than allegedly true, just as any biologist will tell you humans need to breathe air to survive. This is just good old-fashioned common sense. DigitalEnthusiast 22:41, 18 December 2006 (UTC)

Formally

Are there any sources for the claim that pronouncing SQL like sequel is "less formal" than pronouncing it like S-Q-L? Claims about formality are difficult to establish neutrally, so it seems dubious, and I have a mind to remove it unless some citation can be provided to back it up. Nohat 03:46, 12 December 2006 (UTC)

It is less formal by definition, because the letters read "SQL" not "sequel". People only pronounce it "sequel" because they hear other people pronouncing it that way, and in turn those other people only pronounce it that way because of the history behind the acronym. Otherwise it could just as easily be pronounced "squeal", "squall" or any number of other, better words than "sequel". 24.6.99.30 14:28, 14 December 2006 (UTC)
JPEG is pronounced "jay-peg" not "J-P-E-G", and there is nothing "informal" about that pronunciation, so this argument seems a little weak. Are there any verifiable, reliable sources that describe the pronunciation "sequel" as "informal" or "less formal"? Nohat 02:53, 25 December 2006 (UTC)

Removing "original research" / garbage (AGAIN)

The text below had to be removed.

Different syntax between UPDATE and INSERT is for good reason, if the complainer would take then 15 minutes required to learn. SQL is one of the simplest computer languages in existance, but as will all languages, some learning is required.

The rest is a rant and original research that has no place in an encyclopedia. And one of the most absurd example cases I've ever read. If a person wants to fill two printed pages with complains about "typo finder" queries being difficult, Google Blogger is a much better soapbox than Wikipedia.

DigitalEnthusiast 00:18, 22 December 2006 (UTC)

Garbage that needed to be disposed of:


'The difference between value-to-column assignment in UPDATE and INSERT can result in confusion and added work for automated SQL code generation modules.

It does not provide a standard way, or at least a commonly-supported way, to split large commands into multiple smaller ones that reference each other by name. This tends to result in "run-on SQL sentences" and may force one into a deep hierarchical nesting when a graph-like (reference-by-name) approach may be more appropriate and better repetition-factoring. (Views, and stored procedures can help with this, but often require special database privileges and are not really meant for single-query usage.) Here is an illustration for a "typo finder" query: Sample Table "codeTable"

locat code descript


---- --------

10 AA Foo Bar 20 AA Foo Baar 30 AA Foo Bar 10 BB Glab Zab 20 BB Glab Zab ...etc... Sample Query to Find Data-Entry Errors

select * from codeTable where locat not in (30, 50) and code not in

 (
  select code
  from 
    (select code, descript     --(gets unique code-and-descript combos)
     from codeTable
     where locat not in (30, 50)
     group by code, descript
    )
  group by code
  having count(*) = 1
 )

order by code, locat Here we have a table of codes in which we want to find and study typos in the descriptions that are supposed to repeat for each location. For example, the second row in the sample data has the typo "Baar". (Perhaps the repetition is bad normalization, but sometimes one has to deal with such data from old systems.)

In this case we want to ignore codes from location 30 and 50 because we know they are not being used right now and thus we don't care to inspect them. To do it properly, we have to apply the filter in two different places. A language which can create a temporary sub-query and then reference that sub-query by name would result in better repetition factoring, reducing the number of places that the filtering criteria have to be applied or changed, such as if we have to change the location exclusion list. Hypothetical fix:

select * into $temp from codeTable where locat not in (30, 50); -- select * from $temp where code not in

 (
  select code
  from 
    (select code, descript     --(gets unique code-and-descript combos)
     from $temp
     group by code, descript
    )
  group by code
  having count(*) = 1
 )

order by code, locat; Here the name "$temp" is a temporary or virtual table that is used only by this group of queries. The list of exception locations only has to be kept in one place. The difference is not significant for this example, but if the filtering criteria (WHERE clause) is complex then the difference becomes significant.