Getting Started
- You can launch JupyterLab from the command line or from Anaconda Navigator.
- You can use a JupyterLab notebook to edit and run Python.
- Notebooks can include both code and markdown (text) cells.
Variables and Types
- Use variables to store values.
- Use `print` to display values.
- Format output with f-strings.
- Variables persist between cells.
- Variables must be created before they are used.
- Variables can be used in calculations.
- Use an index to get a single character from a string.
- Use a slice to get a portion of a string.
- Use the built-in function `len` to find the length of a string.
- Python is case-sensitive.
- Every object has a type.
- Use the built-in function `type` to find the type of an object.
- Types control what operations can be done on objects.
- Variables only change value when something is assigned to them.
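
The points above can be sketched in a few lines. The variable names and values here are hypothetical and chosen only for illustration.

```python
# Hypothetical values chosen only for illustration.
name = "Ada"
age = 36

# f-strings embed variable values directly in the output text.
greeting = f"{name} is {age} years old"
print(greeting)

# Indexing starts at 0; a slice stops just before its end index.
first_letter = name[0]
first_two = name[0:2]
name_length = len(name)

# Every object has a type, and the type controls which operations work.
print(type(age))
print(type(name))

# Variables only change when something is assigned to them:
# age_next_year keeps its value even after age is reassigned.
age_next_year = age + 1
age = 50
print(age_next_year)
```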
Lists
- A list stores many values in a single structure.
- Use an item’s index to fetch it from a list.
- Lists’ values can be replaced by assigning to them.
- Appending items to a list lengthens it.
- Use `del` to remove items from a list entirely.
- Lists may contain values of different types.
- Character strings can be indexed like lists.
- Character strings are immutable.
- Indexing beyond the end of the collection is an error.
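
A minimal sketch of these list operations, using made-up measurement values. The `try`/`except` blocks are only there so the two error cases can be shown without stopping the program.

```python
# Hypothetical measurements, used only for illustration.
pressures = [0.273, 0.275, 0.277]

pressures[0] = 0.265      # replace a value by assigning to its index
pressures.append(0.279)   # appending lengthens the list
del pressures[1]          # remove the item at index 1 entirely
print(pressures)

# A list may mix values of different types.
mixed = [1, "two", 3.0]

# Strings can be indexed like lists, but they are immutable.
element = "carbon"
print(element[0])
try:
    element[0] = "C"
except TypeError as error:
    immutable_message = str(error)
    print("TypeError:", immutable_message)

# Indexing beyond the end of a collection is an error.
try:
    pressures[99]
except IndexError as error:
    index_message = str(error)
    print("IndexError:", index_message)
```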
Built-in Functions and Help
- Use comments to add documentation to programs.
- A function may take zero or more arguments.
- Commonly-used built-in functions include `max`, `min`, and `round`.
- Functions may only work for certain (combinations of) arguments.
- Functions may have default values for some arguments.
- Use the built-in function `help` to get help for a function.
- Every function returns something.
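
As a rough illustration of the points above, the sketch below calls a few built-in functions with arbitrary example values.

```python
# Comments like this one document what the code is doing.
largest = max(1, 2, 3)
smallest = min("a", "A", "0")   # strings are compared character by character

# round has a default for its second argument (number of decimal places).
rounded_default = round(3.712)
rounded_one = round(3.712, 1)
print(largest, smallest, rounded_default, rounded_one)

# Functions may only work for certain combinations of arguments.
try:
    max(1, "a")
except TypeError as error:
    print("TypeError:", error)

# Every function returns something; print returns None.
result = print("example")
print("result of print is", result)

# help shows the documentation for a function.
help(round)
```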
Libraries & Pandas
- Most of the power of a programming language is in its libraries.
- A program must import a library module in order to use it.
- Use `help` to learn about the contents of a library module.
- Import specific items from a library to shorten programs.
- Create an alias for a library when importing it to shorten programs.
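
The three import styles can be compared side by side. This sketch uses the standard library's `math` module as a stand-in; the same patterns apply to other libraries, for example the conventional `import pandas as pd`.

```python
# Import the whole module and refer to its contents with a prefix.
import math

root = math.sqrt(16)

# Import specific items so they can be used without the prefix.
from math import cos, pi

cosine = cos(pi)

# Create an alias to shorten the prefix.
import math as m

floored = m.floor(2.7)

print(root, cosine, floored)

# help(math) would print the documentation for the whole module.
```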
For Loops
- A for loop executes commands once for each value in a collection.
- The first line of the `for` loop must end with a colon, and the body must be indented.
- Indentation is always meaningful in Python.
- A `for` loop is made up of a collection, a loop variable, and a body.
- Loop variables can be called anything, but giving them a meaningful name is strongly advised.
- The body of a loop can contain many statements.
- Use `range` to iterate over a sequence of numbers.
- The Accumulator pattern turns many values into one.
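
A short sketch of a loop with a multi-statement body, followed by the accumulator pattern. The list of masses is hypothetical.

```python
# Hypothetical collection to loop over.
masses = [3.5, 2.0, 9.25]

# "mass" is the loop variable; the indented lines form the body.
for mass in masses:
    doubled = mass * 2
    print(mass, doubled)

# Accumulator pattern: start from an initial value and update it
# on every pass, turning many values into one.
total = 0
for number in range(1, 6):   # range(1, 6) yields 1, 2, 3, 4, 5
    total = total + number
print(total)
```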
Looping Over Data Sets
- Use a `for` loop to process files given a list of their names.
- Use `glob.glob` to find sets of files whose names match a pattern.
- Use `glob` and `for` to process batches of files.
- Use a list "accumulator": start with an empty list `[]` and append each DataFrame to it.
- The `.merge()` and `.join()` methods and the `pd.concat()` function can combine pandas DataFrames.
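
One way the batch-processing pattern might look is sketched below. The file names, column names, and values are hypothetical; the example writes two small CSV files to a temporary directory first so that it can run on its own.

```python
import glob
import os
import tempfile

import pandas as pd

# Create two small hypothetical CSV files so the example is self-contained.
data_dir = tempfile.mkdtemp()
pd.DataFrame({"year": [2020], "visits": [10]}).to_csv(
    os.path.join(data_dir, "visits_2020.csv"), index=False
)
pd.DataFrame({"year": [2021], "visits": [12]}).to_csv(
    os.path.join(data_dir, "visits_2021.csv"), index=False
)

# glob.glob returns the names of all files matching the pattern.
# Sorting makes the processing order predictable.
pattern = os.path.join(data_dir, "visits_*.csv")
filenames = sorted(glob.glob(pattern))

# List accumulator: start empty, append one DataFrame per file.
frames = []
for filename in filenames:
    frames.append(pd.read_csv(filename))

# pd.concat stacks the collected DataFrames into one.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Note that `concat` is called as `pd.concat(...)` on a list of DataFrames, whereas `merge` and `join` are called on a DataFrame itself.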
Using Pandas
- Use the built-in methods `.sum()`, `.mean()`, `.unique()`, and `.nunique()` to explore summary statistics on the rows and columns in your DataFrame.
- Use `.groupby()` to work with subsets of your dataset.
- Sort pandas Series with `.sort_values()`.
- Use `.loc[]` and `.iloc[]` to pinpoint specific locations in pandas DataFrames.
- Save DataFrames to CSV and pickle files using `.to_csv()` and `.to_pickle()`.
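
The sketch below applies these methods to a small, made-up DataFrame. The column names, row labels, and output file names are assumptions for illustration only.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data with string row labels.
df = pd.DataFrame(
    {
        "branch": ["North", "South", "North", "South"],
        "visits": [10, 20, 30, 40],
    },
    index=["jan_n", "jan_s", "feb_n", "feb_s"],
)

# Summary statistics on a column.
total_visits = df["visits"].sum()
mean_visits = df["visits"].mean()
branch_names = df["branch"].unique()
branch_count = df["branch"].nunique()

# groupby works on subsets of the data, here one subset per branch.
visits_by_branch = df.groupby("branch")["visits"].sum()

# Sort a Series by its values.
sorted_visits = df["visits"].sort_values(ascending=False)

# .loc selects by label, .iloc by integer position; both use square brackets.
by_label = df.loc["feb_n", "visits"]
by_position = df.iloc[0, 1]

# Save to CSV and pickle files in a temporary directory.
out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "visits.csv")
pickle_path = os.path.join(out_dir, "visits.pkl")
df.to_csv(csv_path)
df.to_pickle(pickle_path)

print(visits_by_branch)
```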
Conditionals
- Use `if` statements to control whether or not a block of code is executed.
- Conditionals are often used inside loops.
- Use `else` to execute a block of code when an `if` condition is not true.
- Use `elif` to specify additional tests.
- Conditions are tested once, in order.
- Use `and` and `or` to combine multiple conditions.
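
A minimal sketch of a conditional inside a loop, with hypothetical values and thresholds. Because conditions are tested in order, the value 9.25 matches the first branch and the `elif` test is never reached for it.

```python
# Hypothetical values and thresholds, for illustration only.
masses = [3.5, 2.5, 9.25, 1.75]

labels = []
for mass in masses:
    if mass > 9.0:
        labels.append("very large")
    elif mass > 3.0:
        labels.append("large")
    else:
        labels.append("small")
print(labels)

# and requires both conditions to be true; or requires at least one.
in_range = []
extremes = []
for mass in masses:
    if mass > 2.0 and mass < 5.0:
        in_range.append(mass)
    if mass < 2.0 or mass > 9.0:
        extremes.append(mass)
print(in_range, extremes)
```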
Writing Functions
- Break programs down into functions to make them easier to understand.
- Define a function using `def` with a name, parameters, and a block of code.
- Defining a function does not run it.
- Arguments in a call are matched to parameters in the definition.
- Functions may return a result to their caller using `return`.
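
For illustration, here are two small functions. Their names and parameters are hypothetical and not taken from the lesson.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


def rectangle_area(width, height):
    """Return the area of a rectangle."""
    return width * height


# Defining the functions above did not run them; calling them does.
boiling_c = fahrenheit_to_celsius(212)

# Arguments are matched to parameters by position...
area_positional = rectangle_area(2, 3)
# ...or by name, in which case the order does not matter.
area_keyword = rectangle_area(height=3, width=2)

print(boiling_c, area_positional, area_keyword)
```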
Tidy Data with Pandas
- In tidy data each variable forms a column, each observation forms a row, and each type of observational unit forms a table.
- Reshaping data with pandas is a fundamental step in preparing data for analysis.
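
As one possible illustration of reshaping, the sketch below uses `DataFrame.melt` to turn a "wide" table, where the years are spread across column headers, into a tidy one. The data and column names are made up, and `melt` is just one of several reshaping tools in pandas.

```python
import pandas as pd

# Hypothetical "wide" table: the variable "year" is hidden in the headers.
wide = pd.DataFrame(
    {
        "branch": ["North", "South"],
        "2020": [10, 20],
        "2021": [12, 22],
    }
)

# melt gathers the year columns so that each variable forms a column
# and each observation (one branch in one year) forms a row.
tidy = wide.melt(id_vars="branch", var_name="year", value_name="visits")
print(tidy)
```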
Data Visualisation
- Index time-series data with a `DatetimeIndex` so that pandas time-series operations such as resampling work correctly.
- Use pandas' built-in `plot()` for initial visualisations; when plots suffer from overplotting, filter or resample the data to simplify them.
- Use Plotly for interactive visualisations such as line graphs, area charts, and bar plots, with features like dropdown selections.
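
The data-preparation side of these points can be sketched without any plotting. The daily series below is synthetic, and the choice of monthly resampling is an assumption made for illustration.

```python
import pandas as pd

# Synthetic daily data covering January and February 2024.
dates = pd.date_range("2024-01-01", periods=60, freq="D")
daily = pd.DataFrame({"visits": range(60)}, index=dates)

# Because the index is a DatetimeIndex, time-series operations work.
# "MS" groups the rows by calendar month, labelled by the month start.
monthly = daily.resample("MS").sum()

# Filtering is another way to reduce overplotting: keep only February.
february = daily.loc["2024-02"]

print(monthly)

# monthly.plot() would draw the simplified series (requires matplotlib).
```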
Wrap-Up
- Python supports a large community within and outwith research.
- Follow standard Python style (PEP 8) in your code.