Both PostgreSQL and BigQuery follow the same general SELECT statement structure:

SELECT [DISTINCT] ...
FROM ...
[WHERE ...]
[GROUP BY ...]
[HAVING ...]
[ORDER BY ...]
[LIMIT ...]
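For example, a query using most of these clauses might look like the following (the 'sales' table and its columns are hypothetical):

SELECT region, COUNT(*) AS orders
FROM sales
WHERE amount > 0
GROUP BY region
HAVING COUNT(*) > 1
ORDER BY region
LIMIT 10;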
DuckDB is an open-source, in-process SQL database designed for analytical workloads. It excels at data-intensive applications such as data science and machine learning. It is optimized for in-memory performance and for parallel processing, using multiple cores for increased performance. DuckDB also supports advanced analytics features such as window functions and recursive queries.
DBML (Database Markup Language) is a markup language for documenting data models.
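For example, a minimal DBML sketch of two related tables (the table and column names here are illustrative) looks like:

Table users {
  id int [pk, increment]
  email varchar [unique]
}

Table posts {
  id int [pk]
  user_id int [ref: > users.id]
  body text
}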
How to Create and Manipulate SQL Databases with Python
PostgreSQL, also known as Postgres, is a free and open-source relational database management system emphasizing extensibility and SQL compliance. The limits on database and table sizes are substantial, as shown in the limits page. Installers are available for Windows OS (with an install tutorial), Linux, macOS, and others. Online tutorials are available, as well as online documentation.
Follow the Windows OS install tutorial. It will instruct you to download an installer for Windows OS. Choose the latest version and then click the download icon for Windows x86-64.
PostgreSQL, like most SQL databases, is organized such that it contains default databases for administration ('postgres', 'template0', and 'template1' in the case of PostgreSQL), and you then create a new database for your project. Your project database will have one or more tables. Each table has columns (fields) and rows containing data. An administrator-level user named 'postgres' is created by default. You should create other database users with appropriate database access permissions.
IMPORTANT: Do not modify or add tables to the default 'postgres', 'template0', or 'template1' databases.
Create a new database named 'pytestdb' for your project from psql:
# create database pytestdb;
# \list
Make sure you include the semicolon ";" at the end of the SQL command 'create database'.
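Once the 'pytestdb' database exists, you can create a non-administrator user for it from psql (the user name and password below are placeholders; choose your own):

# create user pyuser with password 'change-me';
# grant all privileges on database pytestdb to pyuser;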
Begin with this beginner Python to PostgreSQL connection tutorial. Keep the password you created for the 'postgres' user in an eWallet. Stack Builder will run after the installation. Choose the installation from the dropdown list ('PostgreSQL ## (x64) on port 5432'). A list of applications will download and then be displayed. You don't need any additional applications, so just click 'Cancel' and continue to click through to exit.
A new Windows menu folder named 'PostgreSQL' will be created from the installation that will include the 'pgAdmin #' app and a 'SQL Shell (psql)' app. Running the 'psql' app as described in the installation tutorial and clicking through the initial prompts will show that a default database named 'postgres' is created on port 5432. Close the command prompt.
Create a Python virtual environment and then use pip to install the package named 'psycopg2' (also known as psycopg).
py -m pip install psycopg2
py -m pip install requests
Create the following script named 'db.py' in the Python virtual environment folder 'PostgreSQL' to connect to the PostgreSQL database, updating 'password=' with the password for user 'postgres':
# Written by: Mark W Kiehl
# http://mechatronicsolutionsllc.com/
# http://www.savvysolutions.info/savvycodesolutions/
#
# Demonstrate using psycopg to create a table and then populate it with
# data using the psycopg function psycopg2.extras.execute_batch.
#
# Portions inspired and derived from the work of Haki Benita https://hakibenita.medium.com/fastest-way-to-load-data-into-postgresql-using-python-d2e6de8b2aaa
#
# https://www.postgresqltutorial.com/postgresql-python/
# https://www.psycopg.org/docs/index.html
# pip install psycopg2
import psycopg2
import psycopg2.extras  # for execute_batch
from datetime import datetime

conn = psycopg2.connect(
    host="localhost",
    database="pytestdb",
    user="postgres",
    password="[your db password]",
)
# Set autocommit=True so every command we execute will take effect immediately
conn.autocommit = True

def create_db_table(cur):
    """ Create the table 'test_a' using the cursor cur """
    try:
        # Execute a SQL statement to get the PostgreSQL version
        cur.execute('SELECT version()')
        # Display the PostgreSQL database server version
        db_version = cur.fetchone()
        print(db_version)
        # Create a new table 'test_a'
        cur.execute("""
            DROP TABLE IF EXISTS test_a;
            CREATE UNLOGGED TABLE test_a (
                id INTEGER,
                description TEXT,
                first_brewed DATE,
                num_float DECIMAL,
                num_int INTEGER
            );
        """)
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        print('Created db table test_a')

def add_data_to_table(cur, list_data):
    # psycopg2 provides a way to insert many rows at once using execute_batch
    # (FYI: much faster than .executemany).
    print('Inserting {} records into the db table test_a'.format(len(list_data)))
    try:
        psycopg2.extras.execute_batch(cur, """
            INSERT INTO test_a VALUES (
                %(id)s,
                %(description)s,
                %(first_brewed)s,
                %(num_float)s,
                %(num_int)s
            );
        """, list_data)
        # Execute a SQL statement to get the number of records in table test_a
        cur.execute('SELECT count(*) FROM test_a;')
        rows = cur.fetchone()
        print('{} rows added to table test_a'.format(rows[0]))
    except (Exception, psycopg2.DatabaseError) as error:
        print('SQL ERROR: {}'.format(error))
    finally:
        print('add_data_to_table() done')

if __name__ == '__main__':
    # Create table test_a
    with conn.cursor() as cur:
        create_db_table(cur)
        # Create some data as a list of dictionaries.
        # NOTE: The keys of each dictionary must match the table field (column) names.
        li = []
        dic = {'id': 1,
               'description': 'my first description',
               'first_brewed': datetime.now(),
               'num_float': 1.2345,
               'num_int': 32767}
        li.append(dic)
        dic = {'id': 2,
               'description': 'my second description',
               'first_brewed': datetime.now(),
               'num_float': 5432.1,
               'num_int': -32767}
        li.append(dic)
        print(li)
        add_data_to_table(cur, li)
    if conn is not None:
        conn.close()
BigQuery is Google's fully managed, serverless data warehouse that enables scalable analysis over petabytes of data. It is a Platform as a Service (PaaS) that supports querying using ANSI SQL. It also has built-in machine learning capabilities. BigQuery stores data in a columnar format.
* Automate Your BigQuery Schema Definitions With 5 Lines of Python
Marie Truong published a tutorial on Medium that demonstrates using BigQuery with Python to load two tables from API endpoints and then join the two tables.
* BigQuery 101: All the Basics You Need to Know
* A Guide to Arrays and Structs in BigQuery
* A Crash Course in Google BigQuery
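As a taste of the arrays-and-structs topic covered above, a BigQuery Standard SQL query over a repeated STRUCT column might look like this (the project, dataset, table, and column names are hypothetical, with 'items' assumed to be an ARRAY of STRUCTs):

SELECT
  order_id,
  item.name,
  item.qty
FROM `my_project.my_dataset.orders`,
  UNNEST(items) AS item
WHERE item.qty > 1;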
Copyright © 2017 - 2024 Mechatronic Solutions LLC