SortEM can be installed through conda or pip. Conda is the easier route: the GPU libraries and other requirements are downloaded together with the Python packages, whereas pip does not install the NVIDIA modules. Installing into a conda environment also avoids breaking paths.



**1) Create the conda environment**:  conda create -n sortem python=3.7

**2) Activate the environment**: conda activate sortem

**3) Download the repository**:  git clone https://gitlab.au.dk/au482896/sortem.git

**4) Go into the directory**: cd sortem

**5) Install the requirements**: conda install --file requirements.txt

**6) Install the GUI**: pip install appjar
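
After the steps above, a quick sanity check of the environment can save a failed first run. The snippet below is a minimal sketch and not part of SortEM: `check_environment` is a hypothetical helper, and it only checks the Python version from step 1 and the `appJar` module from step 6, since the contents of requirements.txt are not listed here.

```python
import importlib.util
import sys


def check_environment(required_modules=("appJar",), expected_python=(3, 7)):
    """Report whether the interpreter version and required modules look right.

    Hypothetical helper: the module list is an assumption based on step 6 only.
    """
    report = {"python_ok": sys.version_info[:2] == tuple(expected_python)}
    for name in required_modules:
        # find_spec returns None when a top-level module cannot be located
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'MISSING or mismatched'}")
```

Run it inside the activated `sortem` environment; any entry that is not `ok` points at the installation step to repeat.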



The algorithm works the following way:
1) The algorithm computes transformation-invariant keypoints represented as a binary vector with values in {-1, 1}, meaning the feature vector contains the same information for all projections of the same molecule (https://ieeexplore.ieee.org/abstract/document/9169844).

2) The vector expresses key characteristics of the protein; each pixel of the computed 16 x 16 image is weighted by a value in [0, max]. Four areas of the protein are extracted and used for training the neural network, which improves the classification.

3) Each protein component is represented as a binarized vector, which is concatenated with the other part, partial and full image vectors, improving the overall accuracy (https://arxiv.org/pdf/1902.09941.pdf).
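
To make steps 1 and 3 concrete, here is a minimal sketch of binarizing feature vectors to {-1, 1} and concatenating the codes of the full image and its parts. It is an illustration under assumptions, not SortEM's implementation: the sign threshold, the function names `binarize` and `concatenate_codes`, and the toy feature values are all hypothetical.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> List[int]:
    """Map each real-valued feature to -1 or +1 by its sign (assumed rule; 0 maps to +1)."""
    return [1 if x >= 0 else -1 for x in features]


def concatenate_codes(full: Sequence[float],
                      parts: Sequence[Sequence[float]]) -> List[int]:
    """Binarize the full-image features and each part, then join them into one code."""
    code = binarize(full)
    for part in parts:
        code.extend(binarize(part))
    return code


# Toy values: one full-image feature vector and four extracted areas.
full_image = [0.7, -0.2, 0.1, -0.9]
four_parts = [[0.3, -0.4], [-0.1, 0.8], [0.5, 0.5], [-0.6, -0.2]]

code = concatenate_codes(full_image, four_parts)
print(code)       # [1, -1, 1, -1, 1, -1, -1, 1, 1, 1, -1, -1]
print(len(code))  # 12
```

The concatenated code keeps the full-image bits first and the part bits after, so two particles can be compared position by position.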


Running SortEM
    
    --num_gpus How many GPUs to use (only tested on a single GPU, can run on multiple GPUs).
    --gpu_list List of strings naming the specific GPUs to use when not using a Slurm queue; write: gpu:0 gpu:1 gpu:2 (does not work for multi-GPU yet).
    --num_cpus Integer; how many CPUs to use to preprocess the images obtained from the mrc files.
    --float16 Write True to use half precision; works well on the Volta series and newer, and increases training speed up to 2.5 times.
    --star List of star files; can contain wildcards.
    --ab The batch size to train with on a single GPU.
    --o The output directory (defaults to ./results).
    --mp The maximum number of particles to use per training epoch.
    --epochs The number of epochs, such that the total number of training images is epochs*mp.
    --tr Use a pretrained model. This skips steps 1 and 2 and the optimization procedure in step 3, so everything is just predicted. This can predict image data within 10 min for a huge dataset.
    --log If the star file contains classes, you can track the training against the actual human classification from RELION / cryoSPARC (to test whether it is worth it).
    --num_classes How many parts of the protein to refine when you want to compare pretraining (step 1) with the number of classes in the star file.
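
As a rough illustration of how some of these options relate, the sketch below parses a subset of the flags with `argparse` and computes the total number of training images as epochs*mp. It is not the parser in main.py: the argument types, the choice of which flags to include, and the sample values (`particles.star`, 100000 particles, 5 epochs) are assumptions; only the flag names and the ./results default come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring a subset of the documented flags."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--star", nargs="+", required=True,
                        help="one or more star files")
    parser.add_argument("--ab", type=int, help="batch size on a single GPU")
    parser.add_argument("--o", default="./results", help="output directory")
    parser.add_argument("--mp", type=int, help="max particles per training epoch")
    parser.add_argument("--epochs", type=int, help="number of epochs")
    return parser


def total_training_images(epochs: int, mp: int) -> int:
    """Total images seen during training, as described for --epochs."""
    return epochs * mp


# Sample command line with made-up values.
args = build_parser().parse_args(
    ["--star", "particles.star", "--ab", "64", "--mp", "100000", "--epochs", "5"]
)
print(args.o)                                        # ./results
print(total_training_images(args.epochs, args.mp))   # 500000
```

With these sample values, 5 epochs of at most 100000 particles each give 500000 training images in total.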


    - Finalize multi gpu support 
    - Finalize transfer learning support
    
Example of a typical run. The star file is required; the --ab argument is the batch size. If the batch size is too big
 python3 main.py --star /u/misser11/Sortinator/p28/*.star --ab 64  
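
In the command above, a POSIX shell expands the unquoted `*.star` before Python starts, so the program receives a list of file names. If a pattern is quoted or comes from a config file, it has to be expanded in code. The sketch below shows one way to do that with the standard `glob` module; `expand_star_patterns` is a hypothetical helper, and whether SortEM expands patterns internally is not stated here.

```python
import glob
import os
import tempfile
from typing import List


def expand_star_patterns(patterns: List[str]) -> List[str]:
    """Expand wildcard patterns into a sorted, de-duplicated list of paths."""
    matches = set()
    for pattern in patterns:
        matches.update(glob.glob(pattern))
    return sorted(matches)


# Demonstration on throwaway files so the example is self-contained.
with tempfile.TemporaryDirectory() as workdir:
    for name in ("run1.star", "run2.star", "notes.txt"):
        open(os.path.join(workdir, name), "w").close()
    found = expand_star_patterns([os.path.join(workdir, "*.star")])
    print([os.path.basename(p) for p in found])  # ['run1.star', 'run2.star']
```

A pattern that matches nothing yields an empty list, which is worth checking before starting a run since the star file is required.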