Structured query language (SQL) is a domain-specific, standardized programming language used to interact with relational database management systems (RDBMS) such as MySQL, SQL Server, IBM Db2, PostgreSQL and Oracle Database.
SQL queries (also known as SQL commands or SQL statements) allow users to easily add, retrieve, update, delete, aggregate and otherwise manage data in a relational database (or SQL database). In these systems, structured data is ordered into rows and columns, which together form tables. Data is typically structured across multiple related tables that can be linked using primary or foreign keys.
Unlike other programming languages, SQL is a declarative language, which means it describes what the user wants the computer to do rather than how to achieve it. This contrasts with imperative or procedural languages (such as Java and COBOL) which require step-by-step instructions.
For example, when users write an SQL query to fetch data, they don't need to outline the steps the database should take to gather that data; they just describe what the result should look like:
SELECT name, email
FROM customers
WHERE city = 'New York' ;
SQL is a fundamental and versatile tool in the modern tech stack, known for its data manipulation capabilities, rapid query processing and strong security features. It also offers compatibility across different systems, scalability to handle growing data needs and robust open source and community support.
In the 1970s, IBM scientists Donald Chamberlin and Raymond Boyce developed and introduced SQL. It originated from the concept of relational models and was initially called structured English query language (SEQUEL) before being shortened to SQL. It became commercially available in 1979 and has since become the global standard for relational database management systems.
SQL was standardized by the American National Standards Institute (ANSI) in 1986 and the International Organization for Standardization (ISO) in 1987. Despite being a standard, SQL has various dialects, such as T-SQL for Microsoft SQL Server and PL/SQL for Oracle Database. These SQL dialects meet specific system needs while remaining compliant with the core ANSI standard commands such as SELECT, UPDATE, DELETE, INSERT and WHERE.
Industry newsletter
Stay up to date on the most important—and intriguing—industry trends on AI, automation, data and beyond with the Think newsletter. See the IBM Privacy Statement.
Your subscription will be delivered in English. You will find an unsubscribe link in every newsletter. You can manage your subscriptions or unsubscribe here. Refer to our IBM Privacy Statement for more information.
Since its development in the 1970s, SQL has become the backbone of modern database systems.
Unlike general-purpose programming languages, SQL is purpose-built for relational databases—and relational databases are, in turn, optimized for SQL. This mutual design renders SQL a highly efficient data management tool.
SQL’s declarative nature makes it accessible even to users with limited programming experience, making it an ideal language for beginners. Its widespread use and integration with other programming languages like Python and Java also make it a valuable skill in broader programming and data environments.
Data analysts, data scientists and database administrators regularly use SQL because it excels at tasks such as data processing, data definition, access control, data sharing, data integration and big data analytics.
In data science, SQL is used to create databases that store large data sets needed for data analysis. Its ability to manipulate and retrieve data from these vast, structured datasets is also crucial in the development of artificial intelligence (AI) and machine learning (ML) applications, which depend on high-quality data for training.
By adhering to ACID properties—atomicity, consistency, isolation and durability—SQL helps ensure reliable transaction processing for critical use cases and sensitive data handling. It also supports more accurate data-driven decision-making, advanced analytics and enhanced business intelligence.
SQL offers many advantages, which is why it has remained one of the most widely used and enduring programming languages.
SQL’s simple commands (such as GROUP BY, ORDER BY, GRANT and REVOKE) empower users of all skill levels to work with databases.
SQL indexes and query optimization techniques improve the speed of data retrieval, and subsequently, enhance database performance.
SQL databases include security features such as user authentication, access controls and encryption to protect data.
SQL adheres to ANSI and ISO SQL standards, which help ensure compatibility with various systems and platforms, including cloud environments and big data tools.
SQL can effectively manage both small and large databases, adapting to growing data needs without significant performance loss.
Many SQL databases are open source and supported by a large, active community that contributes to continuous improvement and problem-solving.
A relational database organizes data in a tabular format (rows and columns) and facilitates relationships between different tables. For instance, a customer service database might use separate tables for customer information, purchases, product codes and contacts, linked by keys like a unique customer ID.
SQL allows users to write queries (and subqueries) to manipulate this data. These commands run through several software components during the SQL process:
A parser verifies the correctness of SQL statements and converts them into a format that the database can understand, such as tokenized symbols. This step involves syntax analysis and semantic checking. The parser will also help ensure the user is authorized to perform the operation.
Then, a relational engine—also known as a query optimizer—plans the most efficient data retrieval, modification or addition strategies. It does so by evaluating different query execution plans. It writes the plan in bytecode, which is a virtual machine language. This step is crucial for optimizing database performance and resource use.
Finally, a storage engine processes the bytecode, runs the SQL statement and manages physical data storage. It handles the physical representation of data, including file formats and data buffering. It also returns the result to the user or app. This step helps ensure efficient data access and updates on the disk. This linkage often involves relationships, such as one-to-many or many-to-many, established using primary and foreign keys to help ensure data integrity.
Relational database management systems (also called SQL systems) consist of many components, including:
SQL commands are traditionally divided into the following categories:
Data definition language manages database objects like tables, views and indexes. It defines the structure and organization of the stored data and the relationships among stored data items.
Data manipulation language manages data within databases through operations like INSERT, UPDATE and OUTER JOIN—which add, modify and combine data.
Data control language controls data access through commands like GRANT (to give permissions) and REVOKE (to remove permissions). It can restrict a user's ability to retrieve, add and modify data.
Data query language executes data queries to retrieve information, typically using the SELECT command. It can retrieve specific data items or a range of items.
Transaction control language manages transaction changes to help ensure data integrity and supports ROLLBACK and COMMIT operations for undoing or storing changes, respectively. It is used to coordinate data sharing by concurrent users.
SQL databases support various SQL statements for data operations. However, SQL commands can vary depending on the database, which may use its own SQL syntax.
Basic SQL commands include:
Retrieves data from one or more tables.
SELECT name, email
FROM customers
WHERE city = 'New York' ;This statement retrieves the name and email of all customers who live in New York from the customers table.
Adds new rows to a table.
INSERT INTO customers (name, email, city)
VALUES ('Jane Doe', 'jane.doe@example.com', 'Los Angeles') ;
This statement adds a new row to the customers table with the name 'Jane Doe', email 'jane.doe@example.com' and city 'Los Angeles.'
Modifies existing data in a table.
UPDATE customers
SET email = 'new.email@example.com'
WHERE name = 'John Doe' ;
This statement updates the email of the customer named 'John Doe' in the customers table to 'new.email@example.com.'
Removes rows from a table based on a condition.
DELETE FROM customers
WHERE city = 'Boston' ;
This statement deletes all rows from the customers table where the city is 'Boston.'
Defines a new table and its structure.
CREATE TABLE products ( product_id INT PRIMARY KEY, name VARCHAR(100), price DECIMAL(10, 2) ) ;
This statement creates a new table called products with three columns: product_id as an integer primary key, name as a variable character string up to 100 characters and price as a decimal with ten digits and two decimal places.
Modifies the structure of an existing table.
ALTER TABLE customers
ADD COLUMN birthday DATE ;
This statement adds a new column birthday of type DATE to the existing customers table.
Deletes a table and all its data.
DROP TABLE old_customers ;
This statement deletes the old_customers table along with all its data.
Combines rows from two or more tables based on a related column.
SELECT c.name, p.name AS product_name FROM customers c JOIN orders o ON c.customer_id = o.customer_id JOIN products p ON o.product_id = p.product_id WHERE c.city = 'New York' ;
The SQL JOIN statement retrieves the names of customers and the names of the products they ordered. It joins the customers, orders and products tables based on the customer_id and product_id, selecting only those customers who live in New York.
SQL databases are relational databases, where structured data is stored in rows and tables that are linked in various ways. SQL is the standard language for interacting with these databases.
NoSQL databases (or non-relational databases) emerged in the late 2000s to handle data with less structure. These types of databases (such as MongoDB) offer more flexible data models compared to SQL databases.
Key differences include:
NoSQL databases are horizontally scalable, managing higher traffic by adding more servers. In contrast, SQL databases are traditionally vertically scalable, requiring more powerful hardware to handle increased load.
SQL databases use a table-based structure ideal for multi-row transactions and complex queries across related data, thanks to robust indexing and joining capabilities. NoSQL offers various structures, such as key-value, document, graph or wide column stores, catering to different needs and allowing for more flexibility with semi-structured or unstructured data.
SQL databases are optimized for complex queries with strict data consistency, following the ACID principles. NoSQL databases, which follow the BASE principles (basically available, soft state, eventual consistency), provide faster performance for specific types of data but with different consistency guarantees.
SQL databases are often chosen for applications requiring complex transactions, consistent data and strict schema adherence, like financial systems, e-commerce platforms or CRM databases. NoSQL is preferred for rapidly changing, large-scale or semi-structured data, such as in social networks, real-time analytics or content management systems.
SQL databases have a more extensive range of resources, such as SQL tutorials and community support due to its longer history and widespread adoption. NoSQL often requires less upfront design and can be easier to scale but often requires more custom development for complex querying and data consistency.
SQL databases require careful schema design and can be demanding in terms of maintenance for schema changes. A NoSQL DBMS offers easier scalability and adaptability for schema changes without extensive downtime or restructuring.
Despite the security strengths of many SQL databases, other enterprise applications can be vulnerable to security issues—such as weak authentication, insecure design and misconfiguration. Due to these vulnerabilities, SQL injection remains a real-world threat to organizations.
SQL injection occurs when hackers manipulate SQL queries to access or corrupt database information. Understanding these vulnerabilities and implementing robust security measures is critical for safeguarding SQL data.
Explore IBM Db2 Big SQL, a hybrid SQL-on-Hadoop engine delivering advanced, security-rich data query across enterprise big data sources, including Hadoop, object storage and data warehouses.
Use IBM database solutions to meet various workload needs across the hybrid cloud.
Unlock the value of enterprise data with IBM Consulting, building an insight-driven organization that delivers business advantage.
IBM web domains
ibm.com, ibm.org, ibm-zcouncil.com, insights-on-business.com, jazz.net, mobilebusinessinsights.com, promontory.com, proveit.com, ptech.org, s81c.com, securityintelligence.com, skillsbuild.org, softlayer.com, storagecommunity.org, think-exchange.com, thoughtsoncloud.com, alphaevents.webcasts.com, ibm-cloud.github.io, ibmbigdatahub.com, bluemix.net, mybluemix.net, ibm.net, ibmcloud.com, galasa.dev, blueworkslive.com, swiss-quantum.ch, blueworkslive.com, cloudant.com, ibm.ie, ibm.fr, ibm.com.br, ibm.co, ibm.ca, community.watsonanalytics.com, datapower.com, skills.yourlearning.ibm.com, bluewolf.com, carbondesignsystem.com, openliberty.io