Difference between revisions of "Structured Query Language in Python"
(This article provides an introduction to structure query language SQL) and compares database management systems which can interact with SQL) |
|||
(One intermediate revision by one other user not shown) | |||
Line 1: | Line 1: | ||
'''THIS ARTICLE IS STILL IN EDITING MODE''' | '''THIS ARTICLE IS STILL IN EDITING MODE''' | ||
− | This article will provide an overview | + | This article will provide an overview of the functioning of Structured Query Language (SQL), and discuss which package in Python should be used under which circumstances. First, we will have a look at how SQL actually works! |
==Introduction to SQL== | ==Introduction to SQL== | ||
Line 10: | Line 10: | ||
A relational database is a collection of data elements having specified relationships following the relational model. This model for data organization was developed by E. F. Codd in 1970 and pursues the goal of making data from large databases accessible without knowledge about its structure and location in the database. | A relational database is a collection of data elements having specified relationships following the relational model. This model for data organization was developed by E. F. Codd in 1970 and pursues the goal of making data from large databases accessible without knowledge about its structure and location in the database. | ||
− | The relational model allows data to be organized into tables. Each table contains one type of record which is described by a set of attributes. In the tables each attribute is represented by a column whereas each record is represented by a row. Relationships between the records of different tables are established using primary and foreign keys. Primary keys are unique identifiers for each row of a table. Foreign keys contain the primary key of another table and thus link the rows of the tables (Codd 1970). | + | The relational model allows data to be organized into tables. Each table contains one type of record which is described by a set of attributes. In the tables, each attribute is represented by a column whereas each record is represented by a row. Relationships between the records of different tables are established using primary and foreign keys. Primary keys are unique identifiers for each row of a table. Foreign keys contain the primary key of another table and thus link the rows of the tables (Codd 1970). |
− | Figure 1 shows an exemplary relational database | + | Figure 1 shows an exemplary relational database that could be applied in a hospital. Its first table contains the patients of the hospital (rows) which are described by patient_id, name, address, and insurance_number (columns). In the second table, the treatments (rows) are stored, described by treatment_id, drug, intakes_per_day, and patient_id as attributes (columns). The first column of both tables contains a unique identifier for each record which is the primary key. The patient ID in the treatments table is a foreign key since it contains a primary key from the patients' table and thus allows to link each treatment to a patient. |
===Figure 1: Exemplary Relational Database=== | ===Figure 1: Exemplary Relational Database=== | ||
Line 18: | Line 18: | ||
==Query Relational Databases with SQL== | ==Query Relational Databases with SQL== | ||
− | The following sections | + | The following sections provide an overview of the main SQL elements used for querying relational databases. Code examples based on the exemplary database shown above are used to illustrate the use of the language elements in typical query structures. |
Remark: SQL is standardized by an ISO norm (ISO/IEC 2016). Nonetheless, the use of the language slightly varies between database systems. | Remark: SQL is standardized by an ISO norm (ISO/IEC 2016). Nonetheless, the use of the language slightly varies between database systems. | ||
===SELECT (DISTINCT) Statement=== | ===SELECT (DISTINCT) Statement=== | ||
− | The SELECT statement is used to retrieve data from a relational database. You can either specify a list of columns | + | The SELECT statement is used to retrieve data from a relational database. You can either specify a list of columns that should be selected from a table or select all columns using a “*”. Furthermore, the source table has to be defined using the FROM clause. Instead of the SELECT statement the SELECT DISTINCT statement can be used in order to show only unique rows in the output table. |
===Code example 1: Simple SELECT statement=== | ===Code example 1: Simple SELECT statement=== | ||
Line 34: | Line 34: | ||
</syntaxhighlight> | </syntaxhighlight> | ||
− | The query would return a table containing the name, address and insurance number of all patients. | + | The query would return a table containing the name, address, and insurance number of all patients. |
===Aliases=== | ===Aliases=== | ||
Line 40: | Line 40: | ||
===Filtering Data=== | ===Filtering Data=== | ||
− | Data can be filtered by using the WHERE clause. It is followed by a boolean function | + | Data can be filtered by using the WHERE clause. It is followed by a boolean function that outputs “True” for all records which should be included and “False” for those which should be excluded. |
Such Boolean functions usually contain comparisons of attributes with fixed values or other attributes. Comparisons are expressed using comparison operators (table 1). | Such Boolean functions usually contain comparisons of attributes with fixed values or other attributes. Comparisons are expressed using comparison operators (table 1). | ||
Line 80: | Line 80: | ||
==Sorting Data== | ==Sorting Data== | ||
− | The rows of the output table can be sorted using the ORDER BY keyword. The keyword is followed by the columns by which the output should be sorted. Thereby, the order of the columns | + | The rows of the output table can be sorted using the ORDER BY keyword. The keyword is followed by the columns by which the output should be sorted. Thereby, the order of the columns determines the priority with which they are considered. The keywords ASC and DESC after a column determine whether the values are put in ascending or descending order. |
===Code example 3: ORDER BY keyword=== | ===Code example 3: ORDER BY keyword=== | ||
Line 92: | Line 92: | ||
===Joins=== | ===Joins=== | ||
− | Joins are used to connect rows from different tables based on matching attributes. | + | Joins are used to connect rows from different tables based on matching attributes. Different types of joins are differentiated by the rows contained in the output table. Common join clauses are the (INNER) JOIN, LEFT (OUTER) JOIN, RIGHT (OUTER) JOIN, and FULL (OUTER) JOIN. Inner joins only return rows that match a row from the other table. Full joins return all rows of both tables. Left joins return all rows from the left table and the matching rows from the right table. Right joins work vice versa. Thereby, the table mentioned before the join clause is considered to be the left table. Join clauses are used in conjunction with the ON keyword which is followed by the join condition defining one or more attributes that will be checked for matching. |
Remark: In case of similar column names in two or more source tables, column names must be supplemented by table names. Aliases can help to make those references shorter (code example 4). | Remark: In case of similar column names in two or more source tables, column names must be supplemented by table names. Aliases can help to make those references shorter (code example 4). | ||
Line 111: | Line 111: | ||
ON t.patient_id = p.patient_id | ON t.patient_id = p.patient_id | ||
</syntaxhighlight> | </syntaxhighlight> | ||
− | The query would return all treatments, described by drug and intakes_per_day as well as the | + | The query would return all treatments, described by drug and intakes_per_day as well as the patient's name and insurance_number if there is an entry with a matching patient_id in the patients' table. |
==Aggregation functions== | ==Aggregation functions== | ||
Line 135: | Line 135: | ||
|} | |} | ||
Table 2 | Table 2 | ||
− | The GROUP BY clause is used to define attributes that form the groups whose values will be aggregated (see table 2). If such attributes are not defined, the whole table gets aggregated | + | The GROUP BY clause is used to define attributes that form the groups whose values will be aggregated (see table 2). If such attributes are not defined, the whole table gets aggregated into one row. |
===Code example 5: Aggregation functions and GROUP BY clause=== | ===Code example 5: Aggregation functions and GROUP BY clause=== | ||
Line 149: | Line 149: | ||
==SQL in Python== | ==SQL in Python== | ||
− | There are several packages you can choose from to use SQL in Python. The two most important ones are ''SQLite'' and oracle. We will have a look at these two | + | There are several packages you can choose from to use SQL in Python. The two most important ones are ''SQLite'' and oracle. We will have a look at these two options and discuss their advantages and disadvantages. |
===SQLite=== | ===SQLite=== | ||
− | SQLite is a relational database management system (RDBMS). It can also be used as a software library, which is a topic for another article. The "Lite" refers to the lightness of setup, administration and required resources, SQLite stores the data in a file that can be used on multiple operating systems, which is not the case for many systems making use of SQL. Deploying it is as simple as linking a library and creating a regular file, this fact causes a high capacity of accessibility. | + | SQLite is a relational database management system (RDBMS). It can also be used as a software library, which is a topic for another article. The "Lite" refers to the lightness of setup, administration, and required resources, SQLite stores the data in a file that can be used on multiple operating systems, which is not the case for many systems making use of SQL. Deploying it is as simple as linking a library and creating a regular file, this fact causes a high capacity of accessibility. |
<syntaxhighlight lang="Python" line> | <syntaxhighlight lang="Python" line> | ||
Line 157: | Line 157: | ||
import sqlite3 | import sqlite3 | ||
− | # Connect to an SQLite | + | # Connect to an SQLite database. If no such file exists, one will be created |
conn = sqlite3.connect('example.db') | conn = sqlite3.connect('example.db') | ||
Line 164: | Line 164: | ||
(date text, trans text, symbol text, qty real, price real)''') | (date text, trans text, symbol text, qty real, price real)''') | ||
− | #The triple quotes | + | #The triple quotes allow spanning the statement over multiple lines. "CREATE TABLE IF NOT #EXISTS are case-insensitive SQL keywords that have been written in capital letters to make #the code more readable. A table with five columns is created. The "text" or "real" behind the #name of the columns determines whether the values should be text or floating-point numbers. |
# Insert some data into the table | # Insert some data into the table | ||
Line 186: | Line 186: | ||
* Does not need any configuration or installation and license | * Does not need any configuration or installation and license | ||
* The size is less than 500kb | * The size is less than 500kb | ||
− | * It is serverless and does not need server process | + | * It is serverless and does not need a server process |
− | * Enables You to work on multiple | + | * Enables You to work on multiple databases in the same session |
− | * It is a cross-platform system and does not need specific platform to run, such as Linux | + | * It is a cross-platform system and does not need a specific platform to run, such as Linux |
* Easy to backup | * Easy to backup | ||
* It's atomic and transactional property which means if part of the operation is failed, all the successful operations will be reverted to the original state. | * It's atomic and transactional property which means if part of the operation is failed, all the successful operations will be reverted to the original state. | ||
Line 197: | Line 197: | ||
* Loose approach towards the datatype handling, allows the insertion of invalid values, you can insert a string in an integer column | * Loose approach towards the datatype handling, allows the insertion of invalid values, you can insert a string in an integer column | ||
* It does not support all the datatypes for example there is no Boolean or Datetime | * It does not support all the datatypes for example there is no Boolean or Datetime | ||
− | * The previous drawback can cause your data be invalid when moving it from SQLite to another platform | + | * The previous drawback can cause your data to be invalid when moving it from SQLite to another platform |
− | * On the same file, simultaneous write | + | * On the same file, simultaneous write is impossible |
− | * The previous limit | + | * The previous limit causes lockage when writing on file and it reduces performance in heavy scenarios |
* Database-level user management is less applicable in SQLite | * Database-level user management is less applicable in SQLite | ||
− | * There are a lot of limitation in maximum usage of features for instance the maximum number of attached | + | * There are a lot of limitation in the maximum usage of features for instance the maximum number of attached databases are 125, the maximum number of rows in a table is 264, the number of tables in a schema is 2147483646, the maximum number of columns is 32767 in a table, the maximum number of tables in a join is 64 tables, maximum number of arguments on a function is 127, the SQLite dataset is limited to 281 terabytes, which is why it is not a good choice for large scale projects |
− | * SQLite stores the entire data in a single file which makes it hard to handle | + | * SQLite stores the entire data in a single file which makes it hard to handle under filesystem limitations |
− | * Due to the database accessibility, it may require more work to | + | * Due to the database accessibility, it may require more work to provide the security of |
private data | private data | ||
Line 212: | Line 212: | ||
==What is Oracle?== | ==What is Oracle?== | ||
− | Oracle database is another (RDBMS). It stocks the largest part of the current market among relational | + | Oracle database is another (RDBMS). It stocks the largest part of the current market among relational database management systems. |
The company provides multiple editions including: | The company provides multiple editions including: | ||
Enterprise Edition, Standard Edition, Express Edition (XE), Oracle Lite, Personal Edition. | Enterprise Edition, Standard Edition, Express Edition (XE), Oracle Lite, Personal Edition. | ||
− | As Oracle is | + | As Oracle is an RDBMS software, it is built on SQL and has some additional extensions. Oracle is a fully scalable database architecture. It is therefore being used by big enterprises which require management and process within their local network that is spread across the world. It also has its own network components to facilitate connections across different networks. |
===Benefits=== | ===Benefits=== | ||
− | * Cost reduction: It Enables you to consolidate several databases into one which can cause shrinking operations and costs. You have the option of cloud bursting when peaks occur, which means that they can temporally increase their cloud resources. This increases the | + | * Cost reduction: It Enables you to consolidate several databases into one which can cause shrinking operations and costs. You have the option of cloud bursting when peaks occur, which means that they can temporally increase their cloud resources. This increases the resistance to crashes |
* One single database is easier to manage and maintain | * One single database is easier to manage and maintain | ||
* High availability and performance, including performance optimization features | * High availability and performance, including performance optimization features | ||
Line 224: | Line 224: | ||
* User control and identity management that is hard to reproduce which means high safety standards | * User control and identity management that is hard to reproduce which means high safety standards | ||
* Quick backup and recovery options to protect and reconstruct data | * Quick backup and recovery options to protect and reconstruct data | ||
− | * Flashback technology: this option allows to recover missed or lost data, | + | * Flashback technology: this option allows to recover of missed or lost data, decreases human errors, and makes the administration much easier |
* Reliability and portability: It can run on almost 20 network protocols and 100 hardware platforms easily | * Reliability and portability: It can run on almost 20 network protocols and 100 hardware platforms easily | ||
− | * Market presence: Oracle is the most famous and largest RDBMS | + | * Market presence: Oracle is the most famous and largest RDBMS vendor with more experience and researchers, tons of supported third parties, and professional staff |
* Support for enterprise applications | * Support for enterprise applications | ||
− | * Oracle DB is compatible with the major platforms, including Windows, UNIX, Linux and macOS. The Oracle database is supported on multiple operating systems, including HP-UX, Linux, Microsoft Windows Server, Solaris, SunOS and macOS | + | * Oracle DB is compatible with the major platforms, including Windows, UNIX, Linux, and macOS. The Oracle database is supported on multiple operating systems, including HP-UX, Linux, Microsoft Windows Server, Solaris, SunOS, and macOS |
===Limitations=== | ===Limitations=== | ||
* Complexity: It requires specialized skills to install and maintain due to its complex engine | * Complexity: It requires specialized skills to install and maintain due to its complex engine | ||
− | * Cost: While it can save costs due to the option to consolidate databases into one, it might also be costly if the potential of the offers | + | * Cost: While it can save costs due to the option to consolidate databases into one, it might also be costly if the potential of the offers is not fully exhausted |
− | * Difficult to manage: It is generally more difficult in managing | + | * Difficult to manage: It is generally more difficult in managing certain operations |
* Not suitable for small or medium size databases. | * Not suitable for small or medium size databases. | ||
* High hardware requirements | * High hardware requirements | ||
Line 239: | Line 239: | ||
Oracle Database supports SQL language to interact with the database. It's considered one of | Oracle Database supports SQL language to interact with the database. It's considered one of | ||
the best databases because it supports all data types like relational, graph, structured, and | the best databases because it supports all data types like relational, graph, structured, and | ||
− | unstructured information. | + | unstructured information. It is apparently a wise choice for large-scale projects and can |
− | support large quantities of data. From announcement of 256 | + | support large quantities of data. From the announcement of 256 big companies, such as Netflix, Linkedin, ebay, it can be concluded that it is appropriate for a large amount |
of data. | of data. | ||
Line 246: | Line 246: | ||
“10 Benefits of Oracle’s Data Management Platform.” n.d. https://www.oracle.com/a/ocom/docs/10-benefits-of-oracle-data-management-platform.pd | “10 Benefits of Oracle’s Data Management Platform.” n.d. https://www.oracle.com/a/ocom/docs/10-benefits-of-oracle-data-management-platform.pd | ||
− | Codd, E.F. 1970. "A Relational Model of Data for Large Shared Data Banks." *Communications of the ACM* 13, no. 6: 377-387. [https://dl.acm.org/doi/pdf/10.1145/362384.362685](https://dl.acm.org/doi/pdf/10.1145/362384.362685) | + | Codd, E.F. 1970. "A Relational Model of Data for Large Shared Data Banks." *Communications of the ACM* 13, no. 6: 377-387. [https://dl.acm.org/doi/pdf/10.1145/362384.362685](https://dl.acm.org/doi/pdf/10.1145/362384.362685) |
“Introduction to Oracle Database,” n.d. https://docs.oracle.com/database/121/CNCPT/intro.htm. | “Introduction to Oracle Database,” n.d. https://docs.oracle.com/database/121/CNCPT/intro.htm. | ||
Line 280: | Line 280: | ||
“What Is SQLite and When to Use SQLite.” 2021. Simplilearn.com. July 22, 2021. https://www.simplilearn.com/tutorials/sql-tutorial/what-is-sqlite. | “What Is SQLite and When to Use SQLite.” 2021. Simplilearn.com. July 22, 2021. https://www.simplilearn.com/tutorials/sql-tutorial/what-is-sqlite. | ||
− | |||
− | |||
The [[Table of Contributors|authors]] of this entry are XX Reinhardt and Hajar Zare. Edited by Milan Maushart | The [[Table of Contributors|authors]] of this entry are XX Reinhardt and Hajar Zare. Edited by Milan Maushart |
Latest revision as of 12:28, 3 September 2024
THIS ARTICLE IS STILL IN EDITING MODE
This article will provide an overview of the functioning of Structured Query Language (SQL), and discuss which package in Python should be used under which circumstances. First, we will have a look at how SQL actually works!
Contents
Introduction to SQL
SQL is the standard database language for relational databases. It is used for retrieving and manipulating data. When analyzing data, SQL is great to retrieve and organize data. It is used by many different institutions, such as hospitals and large companies. Basically anywhere, where data needs to be analyzed! To use SQL in Python, you need to choose a package. We will have a look at the different options later. For now, we will focus on how SQL works.
Relational Databases
A relational database is a collection of data elements having specified relationships following the relational model. This model for data organization was developed by E. F. Codd in 1970 and pursues the goal of making data from large databases accessible without knowledge about its structure and location in the database.
The relational model allows data to be organized into tables. Each table contains one type of record which is described by a set of attributes. In the tables, each attribute is represented by a column whereas each record is represented by a row. Relationships between the records of different tables are established using primary and foreign keys. Primary keys are unique identifiers for each row of a table. Foreign keys contain the primary key of another table and thus link the rows of the tables (Codd 1970).
Figure 1 shows an exemplary relational database that could be applied in a hospital. Its first table contains the patients of the hospital (rows) which are described by patient_id, name, address, and insurance_number (columns). In the second table, the treatments (rows) are stored, described by treatment_id, drug, intakes_per_day, and patient_id as attributes (columns). The first column of both tables contains a unique identifier for each record which is the primary key. The patient ID in the treatments table is a foreign key since it contains a primary key from the patients' table and thus allows to link each treatment to a patient.
Figure 1: Exemplary Relational Database
Query Relational Databases with SQL
The following sections provide an overview of the main SQL elements used for querying relational databases. Code examples based on the exemplary database shown above are used to illustrate the use of the language elements in typical query structures.
Remark: SQL is standardized by an ISO norm (ISO/IEC 2016). Nonetheless, the use of the language slightly varies between database systems.
SELECT (DISTINCT) Statement
The SELECT statement is used to retrieve data from a relational database. You can either specify a list of columns that should be selected from a table or select all columns using a “*”. Furthermore, the source table has to be defined using the FROM clause. Instead of the SELECT statement the SELECT DISTINCT statement can be used in order to show only unique rows in the output table.
Code example 1: Simple SELECT statement
SELECT name, address, insurance_number FROM patients
The query would return a table containing the name, address, and insurance number of all patients.
Aliases
Aliases can be assigned to tables and columns using the AS keyword. Renaming columns can help to make output tables more readable (code example 5); aliases for tables are used to make queries shorter and clearer (code example 4).
Filtering Data
Data can be filtered by using the WHERE clause. It is followed by a boolean function that outputs “True” for all records which should be included and “False” for those which should be excluded.
Such Boolean functions usually contain comparisons of attributes with fixed values or other attributes. Comparisons are expressed using comparison operators (table 1).
Table 1: Common comparison operators
Operator | Description |
---|---|
= | Equal to |
> | Greater than |
< | Less than |
>= | Greater than or equal to |
<= | Less than or equal to |
<> | Not equal to |
(Source: adapted from w3schools 2022b)
Missing values can be identified by the IS NULL operator returning “True” for all records having a null value in the data field of interest. IS NOT NULL is the negation of this operator.
Expressions can be linked with the logical operators AND and OR. AND returns "True" if both linked expressions are true; OR returns "True" if one of those is.
Code example 2: WHERE clause and filter condition
SELECT * FROM treatments WHERE intakes_per_day > 1 AND drug = “ibuprofen” AND patient_id IS NOT NULL
The query would return a table with all treatments where ibuprofen is taken more than one time per day and the patient_id is not missing.
Sorting Data
The rows of the output table can be sorted using the ORDER BY keyword. The keyword is followed by the columns by which the output should be sorted. Thereby, the order of the columns determines the priority with which they are considered. The keywords ASC and DESC after a column determine whether the values are put in ascending or descending order.
Code example 3: ORDER BY keyword
SELECT * FROM treatments ORDER BY intakes_per_day DESC
The query would return all treatments and order them by descending intakes_per_day.
Joins
Joins are used to connect rows from different tables based on matching attributes. Different types of joins are differentiated by the rows contained in the output table. Common join clauses are the (INNER) JOIN, LEFT (OUTER) JOIN, RIGHT (OUTER) JOIN, and FULL (OUTER) JOIN. Inner joins only return rows that match a row from the other table. Full joins return all rows of both tables. Left joins return all rows from the left table and the matching rows from the right table. Right joins work vice versa. Thereby, the table mentioned before the join clause is considered to be the left table. Join clauses are used in conjunction with the ON keyword which is followed by the join condition defining one or more attributes that will be checked for matching.
Remark: In case of similar column names in two or more source tables, column names must be supplemented by table names. Aliases can help to make those references shorter (code example 4).
Figure 2: Common joins
(Source: adapted from w3schools 2022a)
Code example 4: LEFT JOIN
SELECT p.name, p.insurance_number, t.drug, t.intakes_per_day FROM treatments AS t LEFT JOIN patients AS p ON t.patient_id = p.patient_id
The query would return all treatments, described by drug and intakes_per_day as well as the patient's name and insurance_number if there is an entry with a matching patient_id in the patients' table.
Aggregation functions
Aggregation functions aggregate the values of a specified column over several rows. Common aggregation functions are listed in table 2.
Common aggregation functions
Function | Description |
---|---|
MIN() | returns the minimal value |
MAX() | returns the maximal value |
SUM() | returns the sum of the values |
AVG() | returns the average of the values |
COUNT() | counts the values |
COUNT(DISTINCT) | counts the unique values |
Table 2 The GROUP BY clause is used to define attributes that form the groups whose values will be aggregated (see table 2). If such attributes are not defined, the whole table gets aggregated into one row.
Code example 5: Aggregation functions and GROUP BY clause
SELECT patient_id, COUNT(DISTINCT drugs) AS number_of_drugs, SUM(intakes_per_day) AS total_intakes_per_day FROM treatments GROUP BY patient_id
The query would count the different drugs administered and sum up the intakes per day over all drugs per patient_id.
SQL in Python
There are several packages you can choose from to use SQL in Python. The two most important ones are SQLite and oracle. We will have a look at these two options and discuss their advantages and disadvantages.
SQLite
SQLite is a relational database management system (RDBMS). It can also be used as a software library, which is a topic for another article. The "Lite" refers to the lightness of setup, administration, and required resources, SQLite stores the data in a file that can be used on multiple operating systems, which is not the case for many systems making use of SQL. Deploying it is as simple as linking a library and creating a regular file, this fact causes a high capacity of accessibility.
# import the package for SQLite import sqlite3 # Connect to an SQLite database. If no such file exists, one will be created conn = sqlite3.connect('example.db') # Create a table conn.execute('''CREATE TABLE IF NOT EXISTS stocks (date text, trans text, symbol text, qty real, price real)''') #The triple quotes allow spanning the statement over multiple lines. "CREATE TABLE IF NOT #EXISTS are case-insensitive SQL keywords that have been written in capital letters to make #the code more readable. A table with five columns is created. The "text" or "real" behind the #name of the columns determines whether the values should be text or floating-point numbers. # Insert some data into the table conn.execute("INSERT INTO stocks VALUES ('2023-04-19', 'BUY', 'AAPL', 100, 136.71)") # Commit the changes conn.commit() # Query the table cursor = conn.execute("SELECT * FROM stocks") for row in cursor: print(row) # Close the database connection conn.close()
Here are some benefits and limitations of SQLite:
Benefits
- SQLite is an in-memory open-source library
- Does not need any configuration or installation and license
- The size is less than 500kb
- It is serverless and does not need a server process
- Enables You to work on multiple databases in the same session
- It is a cross-platform system and does not need a specific platform to run, such as Linux
- Easy to backup
- It's atomic and transactional property which means if part of the operation is failed, all the successful operations will be reverted to the original state.
- Database can be recovered from system crashes and power outage
- Resilient to out-of-memory and storage errors
Limitations
- Loose approach towards the datatype handling, allows the insertion of invalid values, you can insert a string in an integer column
- It does not support all the datatypes for example there is no Boolean or Datetime
- The previous drawback can cause your data to be invalid when moving it from SQLite to another platform
- On the same file, simultaneous write is impossible
- The previous limit causes lockage when writing on file and it reduces performance in heavy scenarios
- Database-level user management is less applicable in SQLite
- There are a lot of limitation in the maximum usage of features for instance the maximum number of attached databases are 125, the maximum number of rows in a table is 264, the number of tables in a schema is 2147483646, the maximum number of columns is 32767 in a table, the maximum number of tables in a join is 64 tables, maximum number of arguments on a function is 127, the SQLite dataset is limited to 281 terabytes, which is why it is not a good choice for large scale projects
- SQLite stores the entire data in a single file which makes it hard to handle under filesystem limitations
- Due to the database accessibility, it may require more work to provide the security of
private data
When to Use SQLite?
- SQLite is a good choice when you want to combine the SQL querying and storage ability with the ease of access
- Can be a good selection in environments where end users should not be aware of data existence
- As SQLite requires no installation and configuration, beginners can use SQLite for the learning purpose
What is Oracle?
Oracle database is another (RDBMS). It stocks the largest part of the current market among relational database management systems. The company provides multiple editions including: Enterprise Edition, Standard Edition, Express Edition (XE), Oracle Lite, Personal Edition. As Oracle is an RDBMS software, it is built on SQL and has some additional extensions. Oracle is a fully scalable database architecture. It is therefore being used by big enterprises which require management and process within their local network that is spread across the world. It also has its own network components to facilitate connections across different networks.
Benefits
- Cost reduction: It Enables you to consolidate several databases into one which can cause shrinking operations and costs. You have the option of cloud bursting when peaks occur, which means that they can temporally increase their cloud resources. This increases the resistance to crashes
- One single database is easier to manage and maintain
- High availability and performance, including performance optimization features
- Secure hybrid cloud environment offers more options by providing more tools than the other platforms
- User control and identity management that is hard to reproduce which means high safety standards
- Quick backup and recovery options to protect and reconstruct data
- Flashback technology: this option allows to recover of missed or lost data, decreases human errors, and makes the administration much easier
- Reliability and portability: It can run on almost 20 network protocols and 100 hardware platforms easily
- Market presence: Oracle is the most famous and largest RDBMS vendor with more experience and researchers, tons of supported third parties, and professional staff
- Support for enterprise applications
- Oracle DB is compatible with the major platforms, including Windows, UNIX, Linux, and macOS. The Oracle database is supported on multiple operating systems, including HP-UX, Linux, Microsoft Windows Server, Solaris, SunOS, and macOS
Limitations
- Complexity: It requires specialized skills to install and maintain due to its complex engine
- Cost: While it can save costs due to the option to consolidate databases into one, it might also be costly if the potential of the offers is not fully exhausted
- Difficult to manage: It is generally more difficult in managing certain operations
- Not suitable for small or medium size databases.
- High hardware requirements
When to Use Oracle?
Oracle Database supports SQL language to interact with the database. It's considered one of the best databases because it supports all data types like relational, graph, structured, and unstructured information. It is apparently a wise choice for large-scale projects and can support large quantities of data. From the announcement of 256 big companies, such as Netflix, Linkedin, ebay, it can be concluded that it is appropriate for a large amount of data.
References
“10 Benefits of Oracle’s Data Management Platform.” n.d. https://www.oracle.com/a/ocom/docs/10-benefits-of-oracle-data-management-platform.pd
Codd, E.F. 1970. "A Relational Model of Data for Large Shared Data Banks." *Communications of the ACM* 13, no. 6: 377-387. [1](https://dl.acm.org/doi/pdf/10.1145/362384.362685)
“Introduction to Oracle Database,” n.d. https://docs.oracle.com/database/121/CNCPT/intro.htm.
ISO/IEC. 2016. "ISO/IEC 9075-1:2016(en): Information technology — Database languages — SQL — Part 1: Framework (SQL/Framework)." [2](https://www.iso.org/obp/ui/#iso:std:iso-iec:9075:-1:ed-5:v1:en)
Koulianos, Petros. 2020. “5 Reasons to Use SQLite the Tiny GIANT for Your next Project.” The Startup. August 9, 2020. https://medium.com/swlh/5-reasons-to-use-sqlite-the-tiny-giant-for-your-next-project-a6bc384b2df4.
Nguyen, Spencer. 2022. “The Benefits of Oracle DBMS for Your Organization.” DreamFactory Software- Blog. May 25, 2022. https://blog.dreamfactory.com/the-benefits-of-oracle-dbms-for- your-organization/.
Singh, Hemendra. 2019. “Mobile App Development Company | Web Development Company USA India.” The NineHertz: August 14, 2019. https://theninehertz.com/blog/advantages-of-using-oracle-database.
“SQLite Pros and Cons: A.” 2022. ThinkAutomation. February 17, 2022. https://www.thinkautomation.com/our-two-cents/sqlite-pros-and-cons-a/.
“SQLite Python.” 2018. SQLite Tutorial. 2018. https://www.sqlitetutorial.net/sqlite-python/.
Team, CherryRoad Consultancy. 2021. “What Are the Top 8 Reasons to Use an Oracle Database?” CherryRoad Technologies. October 22, 2021. https://www.cherryroad.com/2021/10/22/oracle-database-cloud/.
w3schools. "SQL Joins." Accessed December 23, 2022a. [3](https://www.w3schools.com/sql/sql_join.asp)
w3schools. "SQL Operators." Accessed December 23, 2022b. [4](https://www.w3schools.co /sql/sql_operators.asp)
“What Are the Limitations of SQLite.” n.d. Www.dbtalks.com. https://www.dbtalks.com/tutorials/learn-sqlite/what-are-the-limitations-of-sqlite.
“What Is Oracle? - Definition from WhatIs.com.” n.d. SearchOracle. https://www.techtarget.com/searchoracle/definition/Oracle.
“What Is an Oracle Database? | NetApp.” https://www.netapp.com/oracle/what-is-oracle-database/.
“What Is SQLite?” n.d. Codecademy. https://www.codecademy.com/article/what-is-sqlite.
“What Is SQLite and When to Use SQLite.” 2021. Simplilearn.com. July 22, 2021. https://www.simplilearn.com/tutorials/sql-tutorial/what-is-sqlite.
The authors of this entry are XX Reinhardt and Hajar Zare. Edited by Milan Maushart