NumPy

Array Creation

Generating numpy arrays from lists

import numpy as np

a=np.array([1,2,3,4,5,6,7,8,9])
a
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
type(a)
numpy.ndarray
a.shape
(9,)
  • Just like lists, we can change elements via assignment:

a[0]  = 10 
a[-1] = 20                
a                

Lists of lists create 2D arrays!

b = np.array([[1,2,3],
              [4,5,6]])   

b
array([[1, 2, 3],
       [4, 5, 6]])
b.shape
(2, 3)

Array vs list: which is faster?

We can use %timeit to compare the speed of elementwise operations done with lists vs NumPy arrays.

%timeit [x**2 for x in range(10000)]
543 μs ± 43.7 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
x = np.arange(10000)

%timeit x**2
2.74 μs ± 7.12 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
  • Notice that numpy carried out squaring on every single element at once instead of requiring manual iteration.

  • With 10,000 integers, the Python list comprehension takes hundreds of microseconds on average (about 543 μs in the run above), while the NumPy array completes the same operation in a few microseconds (about 2.74 μs). This is a speedup of roughly 200x from using the NumPy array.

  • For larger lists of numbers, the speed increase using NumPy is considerable.
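
%timeit is an IPython magic, so it only works in a notebook or an IPython shell. As a rough sketch, the same comparison can be run in a plain Python script with the standard-library timeit module. The repeat count of 100 is an arbitrary choice, and the timings will differ from machine to machine.

```python
import timeit
import numpy as np

n = 10_000
x = np.arange(n)

# Both approaches compute the same squares
squares_list = [i**2 for i in range(n)]
squares_numpy = x**2

# Time each approach; the absolute numbers depend on the machine
repeats = 100
t_list = timeit.timeit(lambda: [i**2 for i in range(n)], number=repeats)
t_numpy = timeit.timeit(lambda: x**2, number=repeats)

print(f"list comprehension: {t_list / repeats * 1e6:.1f} μs per loop")
print(f"numpy array:        {t_numpy / repeats * 1e6:.1f} μs per loop")
```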

Elementwise operations with numpy

  • Basic mathematical functions operate elementwise on arrays!

  • Example: np.sqrt(x) or x**0.5 takes the square root of every element of the numpy array x

x = np.array([1,2,3,4])
y = np.array([5,6,7,8])
x+10
array([11, 12, 13, 14])
x + y
array([ 6, 8, 10, 12])
x * y
array([ 5, 12, 21, 32])
y ** 2
array([25, 36, 49, 64])
  • The x+10 example shows that one can also do operations on arrays with unequal shapes!

  • In mathematics you can’t add a scalar to a vector, but in numpy you can!

  • These powerful operations are called broadcasting. See the end for the rules and examples.
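
To illustrate the points above, here is a small sketch of an elementwise function and a scalar operation. The array values are chosen only so that the square roots come out as whole numbers.

```python
import numpy as np

x = np.array([1, 4, 9, 16])

roots_a = np.sqrt(x)   # function applied to every element: 1, 2, 3, 4
roots_b = x**0.5       # same values using the power operator
print(roots_a)
print(roots_b)

# A scalar is combined with every element of the array
halves = x / 2         # values 0.5, 2.0, 4.5, 8.0
print(halves)
```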

Dot product and linear algebra

v = np.array([9,10])
w = np.array([11, 12])

# Inner product of vectors; both produce 219
print(v@w) 
print(np.dot(v, w)) 
219
219
np.linalg.norm(v) # length of vector
np.float64(13.45362404707371)
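
The @ operator also works with 2D arrays. The sketch below uses a small matrix M, made up for this illustration, to contrast the matrix product with the elementwise product.

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])
v = np.array([9, 10])

mat_vec = M @ v        # matrix-vector product: [29 67]
mat_mat = M @ M        # matrix-matrix product
elementwise = M * M    # elementwise product, not a matrix product

print(mat_vec)
print(mat_mat)
print(elementwise)

# The norm is the square root of the inner product of v with itself
same_length = np.isclose(np.linalg.norm(v), np.sqrt(v @ v))
print(same_length)     # True
```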

Generating arrays using special methods

  • Arrays of ones or zeros are also useful as placeholders when we do not need the initial values for computation and intend to overwrite them with other values right away.

  • For instance np.zeros, np.ones, np.empty create such placeholder arrays.

  • np.random contains many functions for generating random numbers. We will utilize those to build simulations

  • There is a large set of methods for generating arrays for common numeric tasks. Listed below are a few that we will use most extensively.

| Function | Description |
| --- | --- |
| np.array([list, of, numbers]) | Array from a list |
| np.arange(start, stop, step) | Array with a known step (stop is excluded) |
| np.linspace(start, stop, num) | Array of num evenly spaced values over [start, stop] |
| np.zeros((rows, cols)) | Array of zeros |
| np.ones((rows, cols)) | Array of ones |
| np.meshgrid(array1, array2) | Two 2D arrays from two 1D arrays |
| np.random.rand() | Random floats in the range [0, 1), uniformly distributed |
| np.random.randn() | Random floats from a normal distribution centered at zero |

np.zeros(3)  # Create an array of all zeros
array([0., 0., 0.])
np.ones(11)
array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
np.random.randn(5)
array([ 0.32392278, -0.5874846 , 1.3435261 , -0.4192224 , -0.9937073 ])
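
A few of the other functions from the table can be sketched as follows. The seeded generator from np.random.default_rng is an alternative to np.random.rand and np.random.randn that is not used elsewhere in this notebook. It is shown here only because it makes the random draws reproducible, and the seed value 0 is arbitrary.

```python
import numpy as np

steps = np.arange(0, 10, 2)        # 0, 2, 4, 6, 8 (stop is excluded)
points = np.linspace(0, 1, 5)      # 0, 0.25, 0.5, 0.75, 1 (stop is included)
print(steps)
print(points)

# meshgrid turns two 1D arrays into two 2D coordinate arrays
X, Y = np.meshgrid(np.arange(3), np.arange(2))
print(X.shape, Y.shape)            # (2, 3) (2, 3)

# Seeded generator so the random draws are reproducible
rng = np.random.default_rng(0)
uniform = rng.random(5)            # floats in [0, 1)
normal = rng.standard_normal(5)    # draws from a standard normal distribution
print(uniform)
print(normal)
```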

Generating N-dimensional arrays

  • Generating 1D arrays is done by specifying the length N: np.zeros(N)

  • Generating 2D arrays is done by specifying N rows and M columns: np.zeros((N, M))

  • Generating 3D arrays is done by specifying three sizes, for example np.zeros((N, M, K))

np.ones((1,5))   
array([[1., 1., 1., 1., 1.]])
np.random.random((4,4)) # Create an array filled with random values
array([[0.45156082, 0.61268901, 0.79056119, 0.47624643],
       [0.04668482, 0.81204034, 0.480528  , 0.13575776],
       [0.28217634, 0.01735394, 0.3371391 , 0.13085063],
       [0.31011152, 0.48744166, 0.22137827, 0.10761608]])
np.random.random((4, 5, 3)) # Create an array filled with random values
array([[[0.61671419, 0.94909685, 0.27718202],
        [0.25404466, 0.30568222, 0.27963052],
        [0.97431296, 0.89849359, 0.39384796],
        [0.80125058, 0.1429913 , 0.83039379],
        [0.5548475 , 0.16923327, 0.63661379]],

       [[0.17192773, 0.82167212, 0.09975055],
        [0.0431958 , 0.31604734, 0.09352455],
        [0.60371103, 0.43386876, 0.19077065],
        [0.16772129, 0.29886172, 0.75222776],
        [0.38475044, 0.78149338, 0.2887919 ]],

       [[0.09550091, 0.86804952, 0.28646837],
        [0.02503076, 0.5965902 , 0.44334156],
        [0.73910104, 0.36164312, 0.54137699],
        [0.93278862, 0.86000666, 0.52780666],
        [0.81323132, 0.02077547, 0.22377493]],

       [[0.65032197, 0.04968008, 0.0123995 ],
        [0.55735747, 0.21455138, 0.88163405],
        [0.52972524, 0.54116418, 0.55497725],
        [0.89796869, 0.47131812, 0.05272567],
        [0.00451078, 0.69548809, 0.40511867]]])
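
To check how the shape tuple maps onto the result, the attributes ndim, shape, and size can be inspected. This is a minimal sketch, and the name cube is used only for this illustration.

```python
import numpy as np

cube = np.zeros((4, 5, 3))   # 4 blocks, each with 5 rows and 3 columns

print(cube.ndim)             # 3
print(cube.shape)            # (4, 5, 3)
print(cube.size)             # 60 elements in total
print(cube[0].shape)         # (5, 3): the first block is a 2D array
```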

Indexing, slicing and shaping arrays

  • Slicing: Similar to Python lists, numpy arrays can be sliced. Since arrays may be multidimensional, you specify a slice for each dimension of the array (trailing dimensions that are left out are taken in full):

Quick example: Create and Slice the data to get the elements shown

a = np.array([[1,2,3,4], 
            [5,6,7,8], 
            [9,10,11,12]])
a.shape
(3, 4)

Predict the sliced elements

a[1,:4]  # row 1, columns 0 through 3
array([5, 6, 7, 8])
a[1,3]
np.int64(8)
a[:,-1] 
array([ 4, 8, 12])
a[-1,:] 
array([ 9, 10, 11, 12])
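
A few more slicing patterns on the same array, as a short sketch: a sub-block, a step, and a single index that selects a whole row.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

block = a[:2, 1:3]       # first two rows, columns 1 and 2
every_other = a[::2, :]  # rows 0 and 2
row = a[1]               # a single index selects a whole row

print(block)             # rows [2 3] and [6 7]
print(every_other)
print(row)               # [5 6 7 8]
```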

Boolean masks

Filtering values

We can also use Boolean masks for indexing -- that is, arrays of True and False values. Consider the following example, where we return all values in the array that are greater than 3:

arr = np.array([1, 2, 3, 4, 5, 6, 7])
mask = arr > 3
mask
array([False, False, False, True, True, True, True])
arr[mask]
array([4, 5, 6, 7])

Or you can apply the Boolean condition directly when indexing the array, which gives an extremely compact and powerful notation:

arr[arr>3]
array([4, 5, 6, 7])

Modifying Elements in place

arr = np.array([10, 20, 30, 40, 50])

arr[arr > 25] = 99
print(arr)  # [10 20 99 99 99]

Using Multiple Conditions

Boolean masks support the logical operators &, |, and ~, which stand for and, or, and not. Each condition must be wrapped in parentheses.

arr = np.array([1, 2, 3, 4, 5, 6])
mask = (arr > 2) & (arr < 5)  # Values greater than 2 AND less than 5

print(arr[mask])   # [3 4]
print(arr[~mask])  # [1 2 5 6]
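
The | operator works the same way. In the sketch below, the parentheses around each comparison are required, because & and | bind more tightly than > and < in Python. The names inside and outside are used only for this illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

inside = (arr > 2) & (arr < 5)      # AND: both conditions hold
outside = (arr <= 2) | (arr >= 5)   # OR: at least one condition holds

print(arr[inside])                  # [3 4]
print(arr[outside])                 # [1 2 5 6]

# ~ negates a mask, so ~inside selects the same elements as outside
masks_agree = np.array_equal(~inside, outside)
print(masks_agree)                  # True
```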

Finding Index Positions

  • With only a condition, np.where() returns the indices where the condition is True.

  • In the example below, we first find those indices, then assign a 1 to all values in the array that are greater than 3 -- and 0, otherwise:

indices = np.where(arr > 3)
print(indices)           # (array([3, 4, 5]),)
np.where(arr > 3, 1, 0)  # array([0, 0, 0, 1, 1, 1])
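
The two calling conventions of np.where can be sketched side by side. The last call, which uses the array itself as one of the choices, is an extra illustration that is not part of the example above.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# One argument: a tuple with one index array per dimension
(idx,) = np.where(arr > 3)
print(idx)                           # [3 4 5]

# Three arguments: pick from the second or third argument elementwise
flags = np.where(arr > 3, 1, 0)
print(flags)                         # [0 0 0 1 1 1]

# The choices can be arrays too, e.g. cap values above 3 at 3
capped = np.where(arr > 3, 3, arr)
print(capped)                        # [1 2 3 3 3 3]
```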

Aggregation

Numpy provides many useful functions for performing computations on arrays; one of the most useful is sum:

x = np.array([[1,2],[3,4]])
np.sum(x,axis=1)
print(np.sum(x))   # Compute sum of all elements; prints "10"
print(np.sum(x, axis=0))  # Compute sum of each column; prints "[4 6]"
print(np.sum(x, axis=1))   # Compute sum of each row; prints "[3 7]"
print(x.max())
print(x.min())
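
Other aggregations accept the same axis argument. A short sketch on the same 2×2 array, with variable names chosen only for this illustration:

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])

overall_mean = x.mean()        # 2.5, mean of all elements
column_means = x.mean(axis=0)  # one mean per column: 2.0 and 3.0
row_maxima = x.max(axis=1)     # largest value in each row: 2 and 4
flat_argmax = np.argmax(x)     # 3, position of the maximum in the flattened array

print(overall_mean)
print(column_means)
print(row_maxima)
print(flat_argmax)
```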

Reshaping arrays

  • In practice, we often run into situations where existing arrays do not have the right shape to perform certain computations.

  • Remember that the size of a NumPy array is fixed once it has been created.

  • Fortunately, this does not mean that we have to create new arrays and copy values from the old array to the new one if we want arrays of different shapes -- the size is fixed, but the shape is not. NumPy provides a reshape method that allows us to obtain a view of an array with a different shape.

x=np.array([1,2,3,4,5,6,7,8,9,10])
x=x.reshape(2,5)
x
x=x.reshape(5,2)
x
x=x.reshape(-1,2) # -1 tells NumPy to infer that dimension automatically
x
arr = np.arange(12)  # Array with values 0 to 11
reshaped = arr.reshape(2, 2, 3)  # 2 blocks, 2 rows, 3 columns
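
Because reshape returns a view here, the reshaped array shares its data with the original. The sketch below checks this with np.shares_memory and also shows that the total number of elements has to stay the same. The flag reshape_ok is a name used only for this illustration.

```python
import numpy as np

arr = np.arange(12)
reshaped = arr.reshape(2, 2, 3)

shared = np.shares_memory(arr, reshaped)
print(shared)                  # True: same underlying data

reshaped[0, 0, 0] = 100        # write through the reshaped view
print(arr[0])                  # 100, the original changed too

# The total number of elements must stay the same
try:
    arr.reshape(5, 3)          # 15 elements requested, but arr has 12
    reshape_ok = True
except ValueError as err:
    reshape_ok = False
    print("reshape failed:", err)
```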

Broadcasting rules of numpy arrays

Broadcasting is a powerful feature in NumPy that enables arithmetic operations between arrays of different shapes. It allows a smaller array to be automatically expanded to match the shape of a larger array without explicit replication, making computations more efficient.


  1. Dimension Padding: If the two arrays have a different number of dimensions, the shape of the smaller array is padded with ones on the left (i.e., leading dimensions) to match the larger array.

  2. Dimension Expansion: If the arrays have mismatched shapes in any dimension, the one with a size of 1 in that dimension is stretched to match the corresponding size of the other array.

  3. Compatibility Check: If the shapes are incompatible—meaning that in any dimension the sizes differ and neither is 1—a broadcasting error occurs.

data     = np.array([[1,2],[3,4],[5,6]])
ones_row = np.array([1,1])
print(data.shape, ones_row.shape)
print(data+ones_row)
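
The three rules can be checked without doing any arithmetic. This sketch assumes a NumPy version that provides np.broadcast_shapes (added in NumPy 1.20). The last case shows the error raised for incompatible shapes, and the flag compatible is a name used only here.

```python
import numpy as np

data = np.array([[1, 2], [3, 4], [5, 6]])           # shape (3, 2)

# (3, 2) with (2,): the second shape is padded to (1, 2), then stretched
padded = np.broadcast_shapes((3, 2), (2,))
print(padded)                                       # (3, 2)

# (3, 1) with (1, 4): each size-1 dimension is stretched
stretched = np.broadcast_shapes((3, 1), (1, 4))
print(stretched)                                    # (3, 4)

# (3, 2) with (3,): trailing sizes 2 and 3 differ and neither is 1
try:
    data + np.array([1, 1, 1])
    compatible = True
except ValueError as err:
    compatible = False
    print("broadcasting error:", err)
```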

np.newaxis

  • np.newaxis increases the dimensionality of an array by adding a new axis, allowing for reshaping and enabling broadcasting in operations.

A = np.array([1, 2, 3])  # Shape (3,)
print(A.shape)

A_new = A[:, np.newaxis]  # Shape (3,1)

print(A_new.shape)
# Define two 1D arrays
A = np.array([1, 2, 3])  # Shape (3,)
B = np.array([10, 20, 30])  # Shape (3,)

# Reshape A to a column vector using np.newaxis
A_col = A[:, np.newaxis]  # Shape (3,1)

# Perform broadcasting: adding a column vector (3,1) and a row vector (3,)
result = A_col + B  # Shape (3,3)

# Print results
print("Array A reshaped as a column:\n", A_col)
print("Array B as a row:\n", B)
print("Result of broadcasting A_col + B:\n", result)
  • A[:, np.newaxis] reshapes A from (3,) to (3,1), making it a column vector.

  • Broadcasting allows us to add A_col (3×1) to B (1×3), producing a 3×3 matrix.

  • This demonstrates how np.newaxis enables broadcasting for element-wise operations in different dimensions.
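
As a further sketch, np.newaxis can be placed in either position, and reshape(-1, 1) produces the same column vector. Multiplying the column by the row broadcasts to a 3×3 multiplication table. The names row, col, and table are used only for this illustration.

```python
import numpy as np

A = np.array([1, 2, 3])

row = A[np.newaxis, :]     # shape (1, 3)
col = A[:, np.newaxis]     # shape (3, 1)
print(row.shape, col.shape)

# reshape gives the same column vector
same_column = np.array_equal(col, A.reshape(-1, 1))
print(same_column)         # True

# Column times row broadcasts to a (3, 3) multiplication table
table = col * row
print(table)
```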

Problems

1. Predict and explain the following statements

  1. Create an array of the numbers 1, 5, 19, 30

  2. Create an array of the numbers -3, 15, 0.001, 6.02e23

  3. Create an array of integers between -10 and 10

  4. Create an array of 10 equally spaced angles between 0 and 2π

  5. Create an array of logarithmically spaced numbers between 1 and 1 million. Hint: remember to pass exponents to the np.logspace() function.

  6. Create an array of 20 random integers between 1 and 10

  7. Create an array of 30 random numbers with a normal distribution

  8. Predict the outcome of the following operation between two NumPy arrays. Test your prediction.

    $$\left[ \begin{array}{cc} 1 & 1 \\ 2 & 2 \end{array} \right] + \left[ 1 \right] = \,\, ?$$
  9. Predict the outcome of the following operation between two NumPy arrays. Test your prediction.

    $$\left[ \begin{array}{ccc} 1 & 8 & 9 \\ 8 & 1 & 9 \\ 1 & 8 & 1 \end{array} \right] + \left[ \begin{array}{cc} 1 & 1 \\ 1 & 1 \end{array} \right] = \,\, ?$$
  10. Predict the outcome of the following operation between two NumPy arrays. Test your prediction.

    $$\left[ \begin{array}{cc} 1 & 8 \\ 3 & 2 \end{array} \right] + \left[ \begin{array}{cc} 1 & 1 \\ 1 & 1 \end{array} \right] = \,\, ?$$

2. Array Manipulation

  1. Create an array B that contains integers 0 to 24 (including 24) in one row. Then reshape B into a 5 row by 5 column array

  2. Extract the 2nd row from B. Store it as a one column array called x.

  3. Store the number of elements in array x in a new variable called y.

  4. Extract the last column of B and store it in an array called z.

  5. Store a transposed version of B in an array called t.

3. Array slicing

  1. The 1D NumPy array G is defined below. But your code should work with any 1D NumPy array filled with numeric values.

G = np.array([5, -4.7, 99, 50, 6, -1, 0, 50, -78, 27, 10])

  • Select all of the positive numbers in G and store them in x.

  • Select all the numbers in G between 0 and 30 and store them in y.

  • Select all of the numbers in G that are either less than -50 or greater than 50 and store them in z.

  2. Generate a one-dimensional array with the following code and index the 5th element of the array.

    arr = np.random.randint(0, high=10, size=10)
  3. Generate a two-dimensional array with the following code.

    arr2 = np.random.randint(0, high=10, size=15).reshape(5, 3)

    a. Index the second element of the third column.

    b. Slice the array to get the entire third row.

    c. Slice the array to access the entire first column.

    d. Slice the array to get the last two elements of the first row.

4. Random numbers

  1. For the following randomly-generated array:

    arr = np.random.rand(20)

    a. Find the index of the largest value in the array.

    b. Calculate the mean value of the array.

    c. Calculate the cumulative sum of the array.

    d. Sort the array.

  2. Generate a random array of values from -1 to 1 (exclusive) and calculate its median value. Hint: start with an array of values from 0 to 1 (exclusive) and manipulate it.

  3. Generate a random array of integers from 0 to 35 (inclusive) and then sort it.

  4. Hydrogen nuclei can have a spin of +1/2 and -1/2 and occur in approximately a 1:1 ratio. Simulate the number of +1/2 hydrogen nuclei in a molecule of six hydrogen atoms and plot the distribution. Hint: because there are two possible outcomes, this can be simulated using a binomial distribution.