Glossary#

All terms are alphabetically organized.

Anaconda#

an open-source distribution of Python (and R) that comes with many useful packages for data science, including conda

Anaconda Prompt#

the command line interface for conda on Windows machines

annotation#

placement of the specific label(bodypart) on an image

base environment#

the default Python installation that comes with Anaconda or Miniconda and includes core Python packages; a good rule of thumb is to never install new packages into the base environment to avoid corrupting it – use virtual environment instead

batch size#

number of images processed in one iteration of training (max value constrained by GPU memory). More precise term is mini-batch

benchmarking#

the practice of objectively comparing machine learning tools to identify the best-performing ones

bodypart#

also called label, in DeepLabCut is a arbitrarily chosen part of the animal that the user wants to track

Bonsai#

a visual reactive programming language that can be used to generate complex experimental workflows. Its Bonsai.Deeplabcut package uses DeepLabCut-live for real-time markerless pose estimation.

Colab#

Google Colab is a web based notebook used for writing and executing python code

conda#

an open-source system for managing packages and environments; comes with Anaconda and Miniconda

CPU#

central processing unit, also known as the processor, - a key component of any computing device

cropping#

cutting out part of the image, used for reducing computational expense

CUDA#

a parallel computing platform developed by NVIDIA for utilisation of NVIDIA GPUs

environment#

see virtual environment and base environment

dataset#

collection of annotations linked to specific images

detection#

placement of the label by the model

GPU#

graphics processing unit, also known as a “graphics card”, capable of high-throughput parallel processing; it greatly accelerates deep learning compared to a CPU

ground truth#

coordinates of the label in the immage annotated by the user

identity#

in multianimal DLC a parameter used for specyfing that annotations are subject specific (user can tell the difference between individuals)

IID#

Independent and Identically Distributed, term used for variables that have same probability distribution but are independent from each other (a coin toss always has a 50% chance to be heads or tails, no matter what the previous result of a coin toss was)

inference#

applying a trained model on data i.e. analysis

iteration#

one pass of the batch through the network

jitter#

natural tendency of inferred data to slightly move between frames of analyzed video. Stems from inference happening on image by image basis

MAE#

mean average Euclidean error – a metric that quantifies the Euclidean distance (what we intuitively understand by the word ‘distance’) between two observations, such as the manually added and predicted bodypart labels in DLC; proportional to RMSE

Markdown#

a lightweight markup language for creating formatted text using a plain-text editor

Miniconda#

a lightweight version of Anaconda that includes only conda and Python, albeit with fewer pre-installed packages; useful if storage space is of concern, but greater familiarity with the command line might be required

MoSeq#

developed in Datta’s Lab, an unsupervised machine learning method used to parse mouse behavior

OOD#

Out-of-Domain, a term used to define data that was not used in training the model (a different dataset)

outlier#

a frame in which model made bad detections

package#

a specifically organised collection of Python modules (simply put: Python code) that achieve a common goal; examples include DeepLabCut and TensorFlow

path#

a string of characters that uniquely defines a file or folder location on a computer, e.g. C:\Users\username\Downloads

project#

the folder structure and all its contents made during project creation and later work

refinement#

step of the workflow used for correction of bad detections on a subset of outlier frames

RMSE #

root-mean-square error, measure of difference between values predicted by the model and ground truth

SimBA#

developed in Golden’s Lab, a framework for training a supervised behavior annotation model

shuffle#

in DeepLabCut: a particular instantiation of train and test sets; multiple shuffles are used for model benchmarking

snapshot#

current state of the model with specific weights learned during training

supervised#

a model trained using labeled data with the goal of predicting the labels on unseen data.

terminal#

the command line interface for conda on MacOS/Linux machines

training#

process in which the model is learning to find weights that will allow it to solve assigned task (tracking bodyparts)

tracklet#

a fragment of a trajectory, relevant in multi-animal tracking. Tracklets are represented as nodes of a graph, whose edges encode the likelihood that a connected pair of tracklets belongs to the same trajectory

unsupervised#

a model trained without using human annotation, only patterns from the data

VAME#

developed by Kevin Luxem and Pavol Bauer, a framework for unsupervised behavior clustering

virtual environment#

a self-contained Python installation that lets users have different versions of the same Python packages on a single machine; great for project management and reproducibility

when you open the terminal (MacOS/Linux) or Anaconda Prompt (Windows), the environment you are currently in is displayed in brackets, e.g. (env) user@MacBook-Pro ~ % or (env) C:\WINDOWS\system32>

weights#

parameters of a neural network used to process the input (images for DLC)