2022. 10. 2. 23:25ㆍData science/Database
A database is a structured collection of meaningful data. Working with one means knowing precisely what the data means and what it is worth.
1. Timeline of databases
| 1960 - 70s (In-company applications) | 1980 - 2000 (Wider applications) | 2000+ social (figures are monthly active users) |
|---|---|---|
| COBOL, SQL (1976); numeric/textual data; accounting; usually per-company, no standardisation; slow data access (think of old storage media such as magnetic tape); simple indexing; CODASYL (the Conference/Committee on Data Systems Languages, a consortium) | SQL standard (1986); MS Access (1992); maturation of the relational model; media (Netflix (1997), iTunes (2001)); NoSQL (1998) | Online banking and eCommerce; text, photos, music, videos; views, likes, comments; sharing, messaging; realtime streaming; "friend" networks, suggestions |
2. Some definitions

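Data: meaningful information
DBMS (Database Management System): the software that defines, stores, and manages the data and controls access to it
Database system: DBMS + data + interface (app, front-end)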
3. SQL(Structured Query Language)
: ANSI/ISO standard language for manipulating relational DBs
: originated at IBM in the early 1970s, when it was called SEQUEL (Structured English Query Language)
4. NoSQL
: Not a traditional relational DB
: Typically used for very large DBs where performance is crucial
5. DB ranking: Oracle, MySQL, Microsoft SQL Server, PostgreSQL, MongoDB
6. Typical DBMS functionality
| Define a DB | Manipulate a DB |
| Construct a DB | Share a DB |
* CRUD(Create, Read, Update, Delete)
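The four CRUD operations can be sketched with Python's built-in `sqlite3` module. This is only a minimal illustration: the `students` table, its columns, and the sample names are hypothetical and not taken from the notes above.

```python
import sqlite3

# In-memory database; the `students` table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Create: insert two rows.
conn.execute("INSERT INTO students (id, name) VALUES (?, ?)", (1, "Ada"))
conn.execute("INSERT INTO students (id, name) VALUES (?, ?)", (2, "Grace"))

# Read: fetch all rows, ordered by id.
rows_after_create = conn.execute(
    "SELECT id, name FROM students ORDER BY id"
).fetchall()

# Update: change one row, selected by its key.
conn.execute("UPDATE students SET name = ? WHERE id = ?", ("Ada Lovelace", 1))

# Delete: remove one row, selected by its key.
conn.execute("DELETE FROM students WHERE id = ?", (2,))

rows_at_end = conn.execute("SELECT id, name FROM students ORDER BY id").fetchall()
print(rows_after_create)
print(rows_at_end)
```

Each operation maps to one SQL statement: `INSERT`, `SELECT`, `UPDATE`, and `DELETE`.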
7. Relational model
- a database is a collection of relations
- a relation is a table of values with rows and columns
- tables are accessed and linked together with keys.
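As a rough illustration of tables being linked together with keys, the sketch below joins two relations on a shared key column. The `departments` and `employees` tables and their contents are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two hypothetical relations; employees.dept_id points at departments.dept_id.
conn.executescript("""
    CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER);
    INSERT INTO departments VALUES (10, 'Research'), (20, 'Sales');
    INSERT INTO employees VALUES (1, 'Kim', 10), (2, 'Lee', 20), (3, 'Park', 10);
""")

# Link the two tables through the shared key value.
linked = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employees AS e
    JOIN departments AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()
print(linked)
```

Each row of `employees` stores only the key of its department; the join recovers the department name from the other table.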
8. Table (aka relation) schema: contains no actual data
: A list of attributes and their data types, which defines the structure of the data.
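To make the distinction between a schema and its data concrete, here is a small sketch using `sqlite3`. The `books` table is a hypothetical example. SQLite's `PRAGMA table_info` lists the attributes and their declared types even though the table holds no rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define a schema only: attribute names and data types, no rows.
conn.execute("""
    CREATE TABLE books (
        isbn  TEXT PRIMARY KEY,
        title TEXT NOT NULL,
        year  INTEGER
    )
""")

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk)
schema = [
    (name, col_type)
    for _, name, col_type, *_ in conn.execute("PRAGMA table_info(books)")
]
row_count = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]

print(schema)
print(row_count)
```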
9. Anatomy of table

10. Keys - a fundamental idea
: Keys uniquely identify a row in a table and create relationships between tables. Types: primary key (PK), candidate key, foreign key (FK)
10.1 Primary Key: uniquely identifies a row in a table; underlined in schema notation
- How to choose a PK? 1. Identify the set of candidate keys -> 2. Select the PK from them.
- Among the candidate keys, look for the minimal (simplest) key (if possible, not a composite one)
- Rules for choosing a primary key
| Must be unique | No NULL | Obviousness: keep it simple |
| If possible, a single attribute rather than a set | Numbers are faster! | Once chosen, try not to change it. |
10.2 Candidate Key: a minimal set of attributes that can uniquely identify a row
- Should be unique, non-null, precise, and not subject to change
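The two-step procedure (identify the candidate keys, then select a PK from them) can be sketched in plain Python. The helper names `identifies_rows` and `candidate_keys` and the sample rows are hypothetical. Note that uniqueness in a sample only suggests a key: whether an attribute set really is a key depends on what the data means, not on the rows that happen to be present.

```python
from itertools import combinations


def identifies_rows(rows, attrs):
    """True if no two rows share the same values for `attrs` and none is None."""
    seen = set()
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if None in values or values in seen:
            return False
        seen.add(values)
    return True


def candidate_keys(rows, attributes):
    """Minimal attribute sets that uniquely identify every row in `rows`."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if identifies_rows(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical sample data.
rows = [
    {"student_id": 1, "email": "ada@example.org", "name": "Ada", "course": "DB"},
    {"student_id": 2, "email": "grace@example.org", "name": "Grace", "course": "DB"},
    {"student_id": 3, "email": "ada2@example.org", "name": "Ada", "course": "DB"},
]

# Step 1: identify the candidate keys.
keys = candidate_keys(rows, ["student_id", "email", "name", "course"])

# Step 2: select the PK; here simply the smallest key, first one found on ties.
primary_key = min(keys, key=len)

print(keys)
print(primary_key)
```

Here both `student_id` and `email` qualify as candidate keys. Following the rules above (single attribute, numeric, unlikely to change), `student_id` is the more natural primary key.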
10.3 Foreign Key: an attribute in one table (the child) that refers to the PK of another table (the parent)
- a corresponding value must exist in the parent
- Careless deletion or insertion might destroy the relationship between the two tables.
10.4 Integrity constraints: DBMS will apply key integrity constraints
| 1. Prevent a PK from being set to NULL |
| 2. Prevent two rows in the same table from having the same PK value |
| 3. Prevent a child's FK from having a value that does not occur in the parent table |
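The sketch below shows a DBMS rejecting statements that would break these key constraints, using `sqlite3`. The `departments` and `employees` tables and the `rejected` helper are hypothetical. Two SQLite-specific assumptions apply: foreign keys are only enforced after `PRAGMA foreign_keys = ON`, and the primary key is declared `NOT NULL` explicitly because SQLite otherwise tolerates NULL in non-integer primary keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
    CREATE TABLE departments (
        dept_id   TEXT PRIMARY KEY NOT NULL,  -- explicit NOT NULL (SQLite quirk)
        dept_name TEXT
    );
    CREATE TABLE employees (
        emp_id  INTEGER PRIMARY KEY,
        dept_id TEXT REFERENCES departments(dept_id)
    );
    INSERT INTO departments VALUES ('R', 'Research');
    INSERT INTO employees VALUES (1, 'R');
""")


def rejected(sql):
    """Run a statement; return True if the DBMS rejects it with an integrity error."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


results = {
    # 1. A PK may not be NULL.
    "null_pk": rejected("INSERT INTO departments VALUES (NULL, 'Nameless')"),
    # 2. Two rows may not share the same PK value.
    "duplicate_pk": rejected("INSERT INTO departments VALUES ('R', 'Robotics')"),
    # 3. An FK value must exist in the referenced table.
    "orphan_fk": rejected("INSERT INTO employees VALUES (2, 'X')"),
    # Careless deletion: removing a row that is still referenced.
    "delete_referenced_row": rejected("DELETE FROM departments WHERE dept_id = 'R'"),
}
print(results)
```

All four statements are rejected, so the relationship between the two tables stays intact.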