| title | created_date | updated_date | aliases | tags |
|---|---|---|---|---|
| SQL | 2024-10-28 | 2024-10-28 | | |
SQL
SQL stands for Structured Query Language and is used to query and modify the data stored in a relational database. Many different implementations exist, such as MySQL and SQLite.
Crash Course Takeaways
SELECT (DISTINCT) [Column names, or * for everything]
FROM table_name
WHERE selection_statements
ORDER BY column_name (DESC)
-- only return 20 entries
LIMIT 20;
[!Info] DISTINCT statement This makes sure that only unique rows are returned. It is used to filter out duplicates, which is an important step when doing statistics on a dataset. Be cautious though: DISTINCT applies to the combination of the selected columns, so depending on the SELECT statement it might collapse genuine datapoints instead of duplicates (e.g. SELECT DISTINCT last_name ... would return a shared last name only once, even if it belongs to several siblings).
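The siblings pitfall can be reproduced in a few lines. This is a minimal sketch using Python's built-in sqlite3 module so that it runs as-is; the table `people` and its rows are made up for illustration.

```python
import sqlite3

# Hypothetical in-memory table, made up for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Sarah", "Connor"), ("John", "Connor"), ("Sarah", "Connor")],
)

# DISTINCT on one column collapses all Connors into a single row
last_names = conn.execute("SELECT DISTINCT last_name FROM people").fetchall()

# DISTINCT on both columns only removes the true duplicate
full_names = conn.execute(
    "SELECT DISTINCT first_name, last_name FROM people ORDER BY first_name"
).fetchall()

print(last_names)
print(full_names)
```

Selecting more columns makes DISTINCT stricter about what counts as a duplicate, which is usually what you want when deduplicating a dataset.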
Filter
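- Use a selection statement after the `WHERE` keyword, e.g. `WHERE last_name = 'Connor'` or `WHERE age <= 18` or `WHERE nationality IN ('Swiss', 'Italian', 'French')`
- `AND`, `OR`, `NOT` are logical operators
- `WHERE name LIKE '%blue%'` returns any string with blue in it (`%` is a placeholder for any number of characters)
- `WHERE name LIKE '____'` returns all names with exactly 4 characters (`_` is a placeholder for a single character)
- `WHERE name IS NULL` keeps only rows where the name is null; `WHERE name IS NOT NULL` filters the nulls out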
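The filters above can be tried out against a small throwaway table. The following is a sketch using Python's built-in sqlite3 module; the table `people`, its rows, and the helper `names` are hypothetical and exist only for this example.

```python
import sqlite3

# Hypothetical in-memory table, made up for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER, nationality TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Anna", 17, "Swiss"),
        ("Luca", 30, "Italian"),
        ("Skyblue", 25, None),
        ("Tom", 12, "German"),
    ],
)

def names(where_clause):
    """Return the names matching a WHERE clause, sorted alphabetically.

    The clause is pasted into the query text, so only use it with
    fixed literals as below, never with user input.
    """
    query = f"SELECT name FROM people WHERE {where_clause} ORDER BY name"
    return [row[0] for row in conn.execute(query)]

minors = names("age <= 18")
adults_from_list = names("nationality IN ('Swiss', 'Italian') AND age > 18")
contains_blue = names("name LIKE '%blue%'")
four_letters = names("name LIKE '____'")
no_nationality = names("nationality IS NULL")

print(minors, adults_from_list, contains_blue, four_letters, no_nationality)
```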
Sort
You can sort the result with the keyword ORDER BY. The default order is ascending (ASC); to reverse it, add DESC after the column name.
CASE Statement
The CASE expression lets you transform values on the fly (e.g. to group or relabel them) without changing the underlying database entries.
SELECT EmployeeName,
    CASE
        WHEN EmpLevel = 1 THEN 'Data Analyst'
        WHEN EmpLevel = 2 THEN 'Middle Manager'
        WHEN EmpLevel = 3 THEN 'Senior Executive'
        ELSE 'Unemployed'
    END
FROM Employees;
Limit Keyword
With the LIMIT keyword you can limit the number of rows the query returns.
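ORDER BY and LIMIT are often combined to get a "top N" result. A minimal sketch with Python's built-in sqlite3 module follows; the table `scores` and its contents are assumptions for illustration.

```python
import sqlite3

# Hypothetical in-memory table, made up for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (player TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Ann", 12), ("Ben", 30), ("Cy", 7), ("Dee", 21)],
)

# Highest score first, then keep only the first two rows
top_two = conn.execute(
    "SELECT player, points FROM scores ORDER BY points DESC LIMIT 2"
).fetchall()

print(top_two)
```

Without an ORDER BY, which rows LIMIT returns is not guaranteed, so the two usually go together.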
Functions
- `COUNT` returns the number of rows of a query: `SELECT COUNT(*) FROM names;`
- `SUM`: sums the values of a column over all returned rows
- `MIN`, `MAX`, `AVG`: smallest, largest and average value of a column
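All five aggregate functions can be used in a single query. The sketch below uses Python's built-in sqlite3 module; the table `orders` and its values are hypothetical.

```python
import sqlite3

# Hypothetical in-memory table, made up for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?)", [(10,), (20,), (30,)])

# One row comes back, with one value per aggregate function
stats = conn.execute(
    "SELECT COUNT(*), SUM(amount), MIN(amount), MAX(amount), AVG(amount) "
    "FROM orders"
).fetchone()

print(stats)
```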
These functions can also be applied to subgroups of the returned rows. To achieve this, use the GROUP BY column_name clause (similar to groupby in the pandas library).
-- The example counts the players in each team
SELECT Team, COUNT(PlayerID)
FROM Players
GROUP BY Team;
WHERE filters rows before they are grouped. If you want to filter the groups themselves, based on an aggregated value, use the HAVING keyword after GROUP BY.
-- This only returns account types that have more than 100 accounts
SELECT AccountType, COUNT(AccountID)
FROM Accounts
GROUP BY AccountType
HAVING COUNT(AccountID) > 100;
COALESCE to default nulls
COALESCE returns the first of its arguments that is not null, which makes it useful for replacing nulls with a default value.
SELECT AVG(COALESCE(HolidaysTaken, 0))
FROM AnnualLeave;
-- or select the first available of the 3 phone numbers
SELECT
CustomerName,
COALESCE(HomePhone, MobilePhone, BusinessPhone) as PhoneNumber
FROM Customers;
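The default value matters for the average, because AVG skips nulls entirely. The sketch below, using Python's built-in sqlite3 module, compares both variants; the `Employee` column and the rows are assumptions added for illustration.

```python
import sqlite3

# Hypothetical in-memory table; one employee has no recorded holidays (NULL)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AnnualLeave (Employee TEXT, HolidaysTaken INTEGER)")
conn.executemany(
    "INSERT INTO AnnualLeave VALUES (?, ?)",
    [("A", 10), ("B", None), ("C", 20)],
)

# AVG ignores the NULL row: (10 + 20) / 2
avg_ignoring_nulls = conn.execute(
    "SELECT AVG(HolidaysTaken) FROM AnnualLeave"
).fetchone()[0]

# COALESCE turns the NULL into 0 first: (10 + 0 + 20) / 3
avg_nulls_as_zero = conn.execute(
    "SELECT AVG(COALESCE(HolidaysTaken, 0)) FROM AnnualLeave"
).fetchone()[0]

print(avg_ignoring_nulls, avg_nulls_as_zero)
```

Which of the two is correct depends on whether a null means "zero holidays" or "unknown".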
Joins
Inner Join (aka. join)
Combines the columns of two tables and returns only those rows for which the join condition finds a match in both tables.
Left Join (aka. left outer join)
This join includes all rows of the first (left) table even if there is no match in the second table. For rows without a match, the columns of the second table are filled with NULL.
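The difference between the two joins shows up as soon as one row has no partner. A minimal sketch with Python's built-in sqlite3 module follows; the tables `Players` and `Teams`, their columns, and the rows are hypothetical.

```python
import sqlite3

# Hypothetical in-memory tables; player "Cal" references a team that does not exist
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Players (PlayerName TEXT, TeamID INTEGER)")
conn.execute("CREATE TABLE Teams (TeamID INTEGER, TeamName TEXT)")
conn.executemany(
    "INSERT INTO Players VALUES (?, ?)", [("Ana", 1), ("Bo", 2), ("Cal", 3)]
)
conn.executemany(
    "INSERT INTO Teams VALUES (?, ?)", [(1, "Lions"), (2, "Tigers")]
)

# Inner join: only players whose TeamID exists in Teams
inner_rows = conn.execute(
    "SELECT p.PlayerName, t.TeamName FROM Players p "
    "INNER JOIN Teams t ON p.TeamID = t.TeamID ORDER BY p.PlayerName"
).fetchall()

# Left join: every player, with NULL (None in Python) where no team matches
left_rows = conn.execute(
    "SELECT p.PlayerName, t.TeamName FROM Players p "
    "LEFT JOIN Teams t ON p.TeamID = t.TeamID ORDER BY p.PlayerName"
).fetchall()

print(inner_rows)
print(left_rows)
```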
INSERT INTO
is used to insert new rows into a database table
UPDATE
is used to change existing database entries; a WHERE clause selects which rows are changed
DELETE
is used to delete all entries that match its WHERE clause. Careful, this is dangerous: without a WHERE clause every row of the table is deleted!
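The three statements can be sketched together with Python's built-in sqlite3 module. The `Accounts` table layout and its rows are assumptions for illustration; the last step shows why a DELETE without WHERE is dangerous.

```python
import sqlite3

# Hypothetical in-memory table, made up for illustration
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts (AccountID INTEGER PRIMARY KEY, AccountType TEXT)"
)

# INSERT INTO: add three rows
conn.execute(
    "INSERT INTO Accounts (AccountID, AccountType) "
    "VALUES (1, 'Savings'), (2, 'Checking'), (3, 'Savings')"
)

# UPDATE: change only the row selected by the WHERE clause
conn.execute("UPDATE Accounts SET AccountType = 'Premium' WHERE AccountID = 2")

# DELETE: remove only the rows selected by the WHERE clause
conn.execute("DELETE FROM Accounts WHERE AccountType = 'Savings'")
remaining = conn.execute("SELECT AccountID, AccountType FROM Accounts").fetchall()

# DELETE without WHERE empties the whole table
conn.execute("DELETE FROM Accounts")
count_after = conn.execute("SELECT COUNT(*) FROM Accounts").fetchone()[0]

print(remaining, count_after)
```

A common precaution is to run the same WHERE clause in a SELECT first and check which rows it returns before turning it into a DELETE.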
Examples
- The queries used in Obsidian's Dataview plugin are quite similar.