Learning SQL the Badass Way II
By Edgar Treischl in SQL
March 21, 2021
Some vocabulary
This is the second blog post from “Learn SQL the badass way”. I outlined the scope of the blog in the first blog entry where I explained why I set the scope to R users and people with some programming languages. You may not find what you are looking for if you are not familiar with R or if you do not have any other experience to work with data, because I focus on the data management part in this blog.
Before we get in touch with new SQL commands, we have to learn some SQL vocabulary. As in other languages, we have to learn some basic vocabulary to advance our SQL skills. Let’s say we have a small collection of tables about books. In the author
table we store information about the book author
(e.g. first name, second name, etc); in the book
table contains information about each book (e.g. genre etc); and the sales
table summarizes the sales data for each book. How do we manage all of these tables and the dependencies in SQL?
Sometimes you encounter a diagram to display how each table is related to each other and we can think of all the independent tables as a collection of tables and we have to figure out how they are related. A entity relationship diagram (ERD) will help you to see the relation of each table, it displays the collection of entities and attributes.
In the SQL world, entities are independent objects. For instance, the book table is a independent object because it exits next to other entities of our collection. Entities have attributes or properties. For instance, the book table contains title, subtitle, book id and more attributes. Thus, entities refer to tables (or in my world data frame) of our collection and attributes refer to columns, or I would say variables, in the table.
Furthermore, we can differentiate between primary and foreign keys in the tables:
- Primary key: Is a unique indicator that helps us to match tables (e.g. a unique author ID)
- Foreign keys: Is a primary key that is defined in other table to create a link between tables (Book ID in Book table and the sales table)
We may also differentiate what information a attribute stores. Some common data formats are:
- Char (for characters)
- Varchar (for variable character length)
- Numeric (Integer, time)
Types of SQL Statements
In the SQL world, we can ultimately distinguish between the data definition and manipulation language:
- Data Definition Language (DDL):
- Commands from the DDL are used to define, change, or drop tables (database)
- SQL examples: Create, Alter, Truncate, Drop
- Data Manipulation Language (DML):
- DML is used to read and modify data in tables
- Those operations are sometimes named as CRUD operations and we learned them in the last blog: Create, Read, Update, and Delete rows in a table
- SQL examples: INSERT, SELECT, UPDATE, DELETE
Now, let’s put some of these concepts into practice:
CREATE
- You can create new tables with
CREATE TABLE
command. It works in three steps. You have to provide a name for your table, each column needs a name, and you have to specify which kind of information will be stored (e.g. numerical values, characters) in the column - The following command creates a toy table with for petsales with five variables:
CREATE TABLE PETSALE (
ID INTEGER NOT NULL,
PET CHAR(20),
SALEPRICE DECIMAL(6,2),
PROFIT DECIMAL(6,2),
SALEDATE DATE
);
As the output illustrates, we can add options to create the table:
- The
ID
variable is an integer that does not accept zero, or in other words:NOT NULL
- The column PET is generated to store to character string
- The column SALEDATE stores dates
- And we could also set a primary key with the clause:
PRIMARY KEY
- A second example
CREATE TABLE PET (
ID INTEGER NOT NULL,
ANIMAL VARCHAR(20),
QUANTITY INTEGER
);
- So far, both tables do not contain any values. With
INSERT INTO
, we fill the table with corresponding values:
INSERT INTO PETSALE VALUES
(1,'Cat',450.09,100.47,'2018-05-29'),
(2,'Dog',666.66,150.76,'2018-06-01'),
(3,'Parrot',50.00,8.9,'2018-06-04'),
(4,'Hamster',60.60,12,'2018-06-11'),
(5,'Goldfish',48.48,3.5,'2018-06-14');
- And for the second table:
INSERT INTO PET VALUES
(1,'Cat',3),
(2,'Dog',4),
(3,'Hamster',2);
As we learned in the last session, we can use SELECT
to check whether it worked:
SELECT * FROM PET;
Table: Table 1: 3 records
ID | ANIMAL | QUANTITY |
---|---|---|
1 | Cat | 3 |
2 | Dog | 4 |
3 | Hamster | 2 |
ALTER TABLE
- We use the
ALTER TABLE
statement to add, delete, or modify columns. For instance: ADD COLUMN, DROP COLUMN; ALTER COLUMN; RENAME COLUMN
First: ADD COLUMN
ALTER TABLE PETSALE
ADD COLUMN QUANTITY INTEGER;
SELECT * FROM PETSALE;
Table: Table 2: 5 records
ID | PET | SALEPRICE | PROFIT | SALEDATE | QUANTITY |
---|---|---|---|---|---|
1 | Cat | 450.09 | 100.47 | 2018-05-29 | NA |
2 | Dog | 666.66 | 150.76 | 2018-06-01 | NA |
3 | Parrot | 50.00 | 8.90 | 2018-06-04 | NA |
4 | Hamster | 60.60 | 12.00 | 2018-06-11 | NA |
5 | Goldfish | 48.48 | 3.50 | 2018-06-14 | NA |
- Again, fill in your values
UPDATE PETSALE SET QUANTITY = 9 WHERE ID = 1;
UPDATE PETSALE SET QUANTITY = 24 WHERE ID = 5;
- Check whether it worked
SELECT * FROM PETSALE;
Table: Table 3: 5 records
ID | PET | SALEPRICE | PROFIT | SALEDATE | QUANTITY |
---|---|---|---|---|---|
1 | Cat | 450.09 | 100.47 | 2018-05-29 | 9 |
2 | Dog | 666.66 | 150.76 | 2018-06-01 | NA |
3 | Parrot | 50.00 | 8.90 | 2018-06-04 | NA |
4 | Hamster | 60.60 | 12.00 | 2018-06-11 | NA |
5 | Goldfish | 48.48 | 3.50 | 2018-06-14 | 24 |
Second: DROP COLUMN
ALTER TABLE PETSALE
DROP COLUMN PROFIT;
SELECT * FROM PETSALE;
Table: Table 4: 5 records
ID | PET | SALEPRICE | SALEDATE | QUANTITY |
---|---|---|---|---|
1 | Cat | 450.09 | 2018-05-29 | 9 |
2 | Dog | 666.66 | 2018-06-01 | NA |
3 | Parrot | 50.00 | 2018-06-04 | NA |
4 | Hamster | 60.60 | 2018-06-11 | NA |
5 | Goldfish | 48.48 | 2018-06-14 | 24 |
Third: ALTER COLUMN
- We can change the data type, for instance, to increase the length of a character variable to VARCHAR(20) with
ALTER COLUMN
ALTER TABLE PETSALE
ALTER COLUMN PET SET DATA TYPE VARCHAR(20);
SELECT * FROM PETSALE;
Table: Table 5: 5 records
ID | PET | SALEPRICE | SALEDATE | QUANTITY |
---|---|---|---|---|
1 | Cat | 450.09 | 2018-05-29 | 9 |
2 | Dog | 666.66 | 2018-06-01 | NA |
3 | Parrot | 50.00 | 2018-06-04 | NA |
4 | Hamster | 60.60 | 2018-06-11 | NA |
5 | Goldfish | 48.48 | 2018-06-14 | 24 |
Forth: RENAME COLUMN
- Use
RENAME COLUMN
to change to a new name:
ALTER TABLE PETSALE
RENAME COLUMN PET TO ANIMAL;
SELECT * FROM PETSALE;
Table: Table 6: 5 records
ID | ANIMAL | SALEPRICE | SALEDATE | QUANTITY |
---|---|---|---|---|
1 | Cat | 450.09 | 2018-05-29 | 9 |
2 | Dog | 666.66 | 2018-06-01 | NA |
3 | Parrot | 50.00 | 2018-06-04 | NA |
4 | Hamster | 60.60 | 2018-06-11 | NA |
5 | Goldfish | 48.48 | 2018-06-14 | 24 |
Truncate
- The
TRUNCATE
statement will remove all(!) rows from an existing table, just like the one we created in the beginning, however, it does not delete the table itself.
TRUNCATE TABLE PET IMMEDIATE;
Caution: DROP TABLE tablename;
drops the entire table!
DROP TABLE PETSALE;