Setting up a scientific python environment
30 Jul 2018, Samuel Hinton
A super short guide to quickly setting up a viable python environment.
This writeup is primarily for the students of PHYS3080 at UQ, but it should be generally applicable. Firstly, let's get the obvious out of the way. It's 2018. Python 2.7 is dead. Long live Python 3!
Getting a Python Environment Up And Running
So let's get a nice python 3 environment set up. Don't google python 3, that would be crazy. Use Anaconda. So head here to download it, and pick the Python 3 version, 64 bits. The installation dialog should look something like this:
Install it just for you, put it in some convenient location, and register Anaconda as the default. If you want Anaconda to be everywhere and never worry about starting it yourself, also add it to your path. It has the potential to confuse your computer if you have other previous python installations, so keep that in mind. So install it, and yay, you're essentially done.
If you've downloaded Miniconda instead of Anaconda, you won't have a few useful addons, like Spyder. If that's the case, open an Anaconda Prompt window (it should be installed now) and type pip install spyder. If you can't find Anaconda Prompt, you can also do this from a terminal / command prompt window; you just might need to navigate to your install directory's Scripts folder.
For me, this would be
If you need any dependencies or libraries in the future, this is the way to get them: pip install <name>, like pip install numpy. If that doesn't work, try conda install <name>. Here's what it looks like for me when I install chainconsumer. For your installations, there might be a lot more text. Don't worry about it unless it fails; it will also be installing dependencies.
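If you have several Python installations and aren't sure which one a bare pip command belongs to, one common workaround is to run pip through the interpreter you are actually using. The sketch below is only an illustration: the helper names pip_install_command and pip_install are made up for this example, and it assumes pip is available in that interpreter.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command tied to the interpreter running this code.

    Hypothetical helper for illustration; "python -m pip" is a standard
    way to make sure the package lands in the same environment.
    """
    return [sys.executable, "-m", "pip", "install", package]


def pip_install(package):
    """Run the install; raises CalledProcessError if pip reports a failure."""
    subprocess.check_call(pip_install_command(package))


if __name__ == "__main__":
    # Only print the command here; call pip_install("chainconsumer")
    # if you actually want to install something.
    print(" ".join(pip_install_command("chainconsumer")))
```

Typing the printed command into Anaconda Prompt should have the same effect as calling pip_install from Python.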
Right, so we should now be good to go. You can verify this by opening Anaconda Prompt and running python -V to get the version of the installed Python. You should see a Python 3 Anaconda version pop up.
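The same check can be done from inside Python, which is handy if you are already sitting in a console. This is a minimal sketch, not an official Anaconda tool: check_environment is a made-up name, and the default package list is just my guess at what this guide needs.

```python
import importlib.util
import sys


def check_environment(packages=("numpy", "matplotlib")):
    """Report the running Python version and which packages can be found.

    Hypothetical helper for illustration. find_spec only looks the
    package up; it does not import it.
    """
    report = {"python": "%d.%d.%d" % sys.version_info[:3]}
    for name in packages:
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(key, value)
```

If the version starts with 3 and every package shows True, the installation went fine. Anything showing False can be fixed with pip install <name>.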
Writing Code in Python
Anaconda, at the moment, comes bundled with a handy piece of software called Spyder. It allows you to write code and execute it in an IPython console (which is a normal Python console with extra fun features like being able to embed figures inside it).
So, open up Spyder. Hopefully it's installed as an application; if not, you can launch it manually from inside your Anaconda installation. For me to do this, I would run the executable at
On other systems, you might launch Spyder with a shell file or similar.
Just to verify you have the basic packages, let's do some plotting with numpy and matplotlib. I'm deliberately adding extra options to the plotting calls so that you can see how to easily change things like the colour, size, line width, line style, etc. Feel free to delete all these options; the code will still work.
import numpy as np
import matplotlib.pyplot as plt

# Get some fake linear data
x = np.linspace(0, 10, 1000)
y = x + np.random.normal(size=1000)

# Create a figure with one subplot.
# Yes this can be simpler, but now it's useful for the future.
fig, ax = plt.subplots(ncols=1, nrows=1)

# Plot our data points and line
ax.scatter(x, y, alpha=0.3, c="b", lw=0, s=5, label="Data")
ax.plot(x, x, color='k', ls="--", label="Model")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

# If you wanted to save it, uncomment
# fig.savefig("example.png", pad_inches=0, bbox_inches="tight")
So that's Spyder. Other alternatives are Jupyter Notebooks or, for a more heavyweight solution, PyCharm. Use whichever you want, though for astrophysics courses you probably don't need to spend much time worrying.
A final example on linking multiple files
No one likes huge files overflowing with a hundred functions and thousands of lines of code. So here is a quick example on how to call functions from other files.
Here we have three files all in the same directory: a data file called data.txt, one called load_data.py and one called fit_data.py. The Python in the latter two files, respectively, is:
import numpy as np


def get_data(filename):
    return np.genfromtxt(filename, dtype=None, names=True)
import numpy as np
import matplotlib.pyplot as plt
from load_data import get_data


def fit_data(x, y):
    m, c = np.polyfit(x, y, deg=1)
    return m, c


def plot(x, y, m, c):
    fig, ax = plt.subplots(ncols=1, nrows=1)
    ax.scatter(x, y, c="b", lw=0, s=10, label="Data")
    xs = np.linspace(np.min(x), np.max(x))
    ax.plot(xs, m * xs + c, color='k', ls="--", label="Model")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.legend()
    plt.show()


if __name__ == "__main__":
    filename = "data.txt"
    data = get_data(filename)
    x, y = data["time"], data["velocity"]
    m, c = fit_data(x, y)
    print("Best fit has gradient %0.2f and offset %0.2f" % (m, c))
    plot(x, y, m, c)
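The script reads a data.txt with time and velocity columns, and the contents of that file aren't shown here. If you don't have one lying around, something like the sketch below can generate a stand-in. Everything in it is an assumption for illustration: the function name write_fake_data, the gradient, the offset and the noise level are all made up, and it only uses the standard library so it doesn't depend on the code being tested.

```python
import random
from pathlib import Path


def write_fake_data(filename="data.txt", n=50, gradient=2.0, offset=1.0, seed=0):
    """Write n rows of noisy linear data with a "time velocity" header line.

    Hypothetical helper: the numbers are invented and n must be at least 2.
    """
    rng = random.Random(seed)
    lines = ["time velocity"]  # header that names the two columns
    for i in range(n):
        t = 10.0 * i / (n - 1)  # evenly spaced times from 0 to 10
        v = gradient * t + offset + rng.gauss(0.0, 1.0)  # line plus noise
        lines.append("%.4f %.4f" % (t, v))
    path = Path(filename)
    path.write_text("\n".join(lines) + "\n")
    return path
```

Calling write_fake_data() once in the same directory creates data.txt. Because get_data passes names=True to np.genfromtxt, the header line becomes the column names, which is what makes data["time"] and data["velocity"] work.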
You can see that to import a function from load_data.py all we do inside fit_data.py is write from <filename> import <function>, with the .py extension dropped from the filename. I've also included some basic code for loading data files with named columns, which might come in handy. Also, note that now we have multiple files it is useful to break what you are doing down into functions, and to have a main block, which in Python is written as if __name__ == "__main__":. This is useful because the code inside that if statement only gets executed if you run that file directly. Code outside that if statement will also be executed when the file is imported. You can try this: add a print("hello") command to load_data.py outside of any if statement and you'll see it print out when you import it.
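If you'd rather see that behaviour without editing your own files, here is a small self-contained sketch. It writes a throwaway module to a temporary directory, imports it, and records what gets printed. The module name guard_demo and the helper import_and_capture are made up for this example.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# Source of a throwaway module: one top-level print, one guarded print.
MODULE_SOURCE = '''
print("hello from the top level")

if __name__ == "__main__":
    print("hello from the main block")
'''


def import_and_capture(module_name="guard_demo"):
    """Import the throwaway module and return whatever it printed."""
    buffer = io.StringIO()
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, module_name + ".py").write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)  # make the temporary directory importable
        importlib.invalidate_caches()
        try:
            with contextlib.redirect_stdout(buffer):
                importlib.import_module(module_name)
        finally:
            # Tidy up so the demo leaves no trace in this session
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
    return buffer.getvalue()


if __name__ == "__main__":
    print(repr(import_and_capture()))
```

Only the top-level message is captured. When a file is imported, its __name__ is the module name rather than "__main__", so the guarded block is skipped.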
If I've left something out, please let me know, but hopefully you're now good to go with a light-weight scientific python environment.