This document will describe most aspects of the DAQ used for experiment I257.
The guide is divided into the following sections:
# Table of Contents
- [Hardware](#hardware)
- [Software](#software)
- [How to start the DAQ](#startup-guide)
- [Analysis](#analysis)
# Hardware
We are currently running a two-crate system. This means we have two crates, each equipped with one VME computer (__RIO4__) and one trigger module (__VULOM4b__). A desktop computer (__Polarbear__) acts as the final DAQ computer for the two VME computers, and all user interaction goes through this computer. Notice that both VME computers boot from __Polarbear__.
One of the crates acts as the master. That means it works more or less as a single-crate configuration, but it also answers to the BUSY of the slave, etc. This crate handles all the advanced triggering logic, and this is where we can input triggers from external sources (shapers, etc.).
The other crate acts as the slave. The slave's trigger configuration is extremely simple: it basically tells the master when it is busy, and issues GATE signals when the master tells it to.
The VME computer in each crate handles the readout of the data from the modules. Since we have two crates, we use an event builder to combine the data from both computers into valid events. The event builder attaches to both the master and the slave and receives the data from the readout software on each node. This event builder runs on __Polarbear__.
The VME computer in the master crate will be referred to as __RIO7__ (7 is due to its IP address) and the VME computer in the slave crate is referred to as __RIO2__ (again, due to its IP address).
Both VME computers boot from the same NFS-filesystem on __Polarbear__ (via the ethernet connection), and one will therefore encounter the same files when accessing either VME computer.
__Polarbear__ can be accessed as the user `acquser`. You can access it remotely from any computer on a CERN network.
```bash
ssh acquser@polarbear
# If you are logged in to sec@pcepisdaq6
ssh pbear
```
One can access the VME computers from __Polarbear__ with `ssh`:
```bash
ssh rio2
ssh rio7
```
Below is a picture of the rack with the crates as well as __Polarbear__:
# Software
We use a rather broad palette of different programs and libraries to make everything work. In order not to have to worry too much about the individual parts, we have written __DAQC__, which handles all the different parts and provides a user-friendly interface for control and monitoring. The following list is an extremely brief description of the different components. Most of the time, however, all you need to worry about is __DAQC__, which will be described in the next section.
- `drasi`: The data acquisition system (data pump) that handles the transport of the data acquired from our modules. It allows the use of a so-called `f_user.c`, which is a piece of C code that handles the actual readout of the modules. The `f_user.c` is custom made. We run 3 drasi instances: one on each VME computer and an event builder on __Polarbear__.
- `nurdlib`: The VME library that knows the data layout of a wide variety of modules and how to do an actual data readout. We use this to perform the tasks of our `f_user.c`.
- `trlo II`: Firmware for the VULOM modules. We use this to program our trigger logic.
- `ucesb`: Data unpacking software. We use it mainly for two purposes: it attaches to the data stream from the event builder and saves the data to disk, and it attaches to the same stream and fans it out to multiple online analysis programs. A major advantage of `ucesb` is that it checks online that the data is not corrupt.
- `go4`: Online analysis. We have interfaced this with `ucesb` in a project called `go4cesb`, which we use to run a `go4` server. We can then attach multiple `go4` GUI clients.
- `DAQC`: Data acquisition controller program that manages all the different pieces of software that make the DAQ run.
To stress it again - you will mainly need to worry about __DAQC__ and __Go4__, which run on __Polarbear__.
## Prerequisites
In principle anyone can run the DAQ. We have tried to make it as user-friendly as possible. When everything is running, anybody who can open a browser can also run the DAQ. However, if you need to start from scratch or something is not working properly, it is useful to know some basics about the Linux terminal and a few standard Linux tools. It is worth spending a little bit of time getting familiar with the following:
- Navigate in the terminal shell and run executables. Commands such as `cd`, `pwd`, `ls`, `cat`, `cp` are examples of very useful commands.
- Use `ssh` to access a remote computer.
- Use `screen` sessions. Learn how to create, list, attach and detach (see the example after this list).
- Use `ping` to test network connections.
- Run `python` scripts. Also look into `virtualenv`s.
- Use `scp`, `sftp` or `rsync` to copy files between computers.
- Use a text editor to edit files. There are GUI options such as `gedit` on __Polarbear__, but it may be worthwhile to learn some basic usage of a terminal-based editor such as `vi`, `vim`, `emacs` or `nano`. A terminal editor is, for instance, our only option on the VME computers.
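For example, basic `screen` usage looks like this (the session name here is just an illustration):
```bash
screen -S mysession   # create (and attach to) a new session called "mysession"
# detach from inside the session with Ctrl+A D
screen -ls            # list existing sessions
screen -r mysession   # re-attach to the session by name
```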
## DAQC
As mentioned earlier, __DAQC__ manages all the different components. It is probably easiest to understand what it is by going through the web interface. I have marked the essential components (i.e. the components that need to be alive in order to take data) with *.
Generally speaking, the color codes are:
- Green: Everything is OK,
- Yellow: We are starting up the service,
- Red: Something is wrong,
- Grey: The service is disabled.
In the __FAQ__ you can learn how to start __DAQC__. So let's assume you have started it up. __DAQC__ provides a pretty web frontend on port `8080`. This means that you can open a browser on any computer on the CERN network (including __Polarbear__) and type `polarbear:8080`.
Notice that even though you can access it from anywhere on the CERN network, you need the credentials to be able to change anything.
The web frontend will show you something like the following:
1. __VME computers*__ (__RIO2__, __RIO7__): This box shows you whether the VME computers are online, i.e. if they can be reached from __Polarbear__.
2. __Readout*__: This box shows the status of the readout, i.e. the event builder on __Polarbear__ and the two `drasi` instances running on the two VME computers. The `2/2` means that both nodes are running. The event builder broadcasts the data such that we can attach to it and save it to a file.
3. __Relay__: The relay is a `ucesb` instance that attaches to the event builder's data stream and fans it out to several data streams. This also validates the data, i.e. if it crashes, the data may be corrupt. The relay provides data to, among others, the online analysis. So if the relay dies, so does the online analysis. The relay is *NOT* responsible for the data taking or recording, and is thus non-essential.
4. __Go4__: The online analysis. We run the online analysis as a server. That means that we do not NEED a GUI. Of course, to see anything, we need a GUI. However, by running the analysis as a server, we can attach multiple GUI instances. Moreover, if the GUI crashes, the online analysis process does not necessarily crash. In this way, we can also access Go4's web version in a browser. Open up a browser and type `polarbear:5000`, and you will see the online analysis (if it is running, of course).
Notice that __DAQC__ only knows about the server, and not about any of the GUI instances. So if they die, simply reopen them.
5. __Sync__: We can automatically synchronize the recorded data to remote hosts, for instance a server at a local institute. You can see in the __FAQ__ how to add a new destination.
6. __Mesytec__: We have a bunch of Mesytec amplifiers, and their settings are usually quite important for the experiment. Before the experiment starts, one has usually settled on some set of parameters. These parameters can be saved using the Mesytec Control. In order to be sure that these settings haven't changed during the experiment (power failure, manual tinkering, etc.) __DAQC__ can compare the current settings of the modules with some saved reference settings (typically the ones you would also load). If it turns red, it means that there are inconsistencies between these two sets of settings. Refer to the __FAQ__ section to see how you can manage this feature.
7. __Triggers__: The lower left section allows you to enable or disable trigger logic selections. These are generated from the inputs in the VULOM in the master crate. You can change the total trigger configuration by pressing the edit button in the top right corner. You can now enable and disable the different trigger inputs. The resulting trigger will be an __OR__ of all the enabled triggers. You can also downscale a single trigger by powers of two in the input field. For instance, a 0 will take every trigger, whereas 3 will take every 2^3=8th trigger.
8. __File log__: (upper right) Shows the recently taken files.
9. __File taking__: To start a file, simply click the big blue "Start" button. Whenever we start a file, we also log all the settings from the shaper modules, the configuration of the VME modules and the enabled triggers. When you press start, you are met with a run sheet. Please do fill this out. After the run, it is saved in `runs.log` in the data directory. Here you can retrieve all the information afterwards. When you are done, simply press "Stop", and do your final edits to the run sheet.
__Note I__: When you press Start, the file taking starts immediately - so take your time to fill out the run sheet. When you press Stop, the file taking also stops immediately - so take your time to fill out the run sheet.
__Note II__: The data acquisition continues to run after a file is closed. Only file recording is stopped by pressing "Stop".
When taking a file, instead of saving one big fat file for each run, we chop it up into smaller pieces. The size of these pieces can be chosen in the bottom right corner. This results in a data structure like this:
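For illustration, the raw data directory might look something like the following (the file names and the `.lmd` suffix are an assumed example of MBS-style list-mode data; the actual pattern depends on the run number and configuration):
```
data/raw/
├── 15_0001.lmd
├── 15_0002.lmd
└── 15_0003.lmd
```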
# Analysis
To see the online analysis, you can open the dedicated GUI program __Go4__ on __Polarbear__ by just typing `go4` in a terminal. If you want to watch the online analysis from a remote computer on the CERN network, you can open a browser and type `polarbear:5000`. This will open a web version of the online analysis.
To see the different histograms, double-click `localhost:5000` on the left side, and then double-click the `Histograms` tab.
Make sure that the three green arrows in the top left (second toolbar) are pressed. This makes sure that the plots are being updated.
If nothing happens, follow the flowchart in the troubleshooting section.
Go4 is accessible only inside the CERN network. That means anyone inside the CERN network can access Go4 and start/stop the monitoring as well as clear the histograms.
### Editing the online analysis
The online analysis is a custom made integration of the GSI project [Go4](https://www.gsi.de/en/work/research/experiment_electronics/data_processing/data_analysis/the_go4_home_page.htm?C=0), [ausalib](https://git.kern.phys.au.dk/ausa/ausalib/wikis/home) and [ucesb](http://fy.chalmers.se/~f96hajo/ucesb/). We call the project [go4cesb](https://git.kern.phys.au.dk/ausa/go4cesb/wikis/home).
This project enables us to write custom online analysis relatively easily, as well as to use some prewritten analysis. See [this guide](https://git.kern.phys.au.dk/ausa/go4cesb/wikis/custom-analysis) for writing custom online analysis. Notice that `parameters.json` and `AusaUserAnalysis.h/cxx` are symbolically linked to the files in `~/<experiment number>/go4/`.
The prewritten code makes it easy to include calibrations. Simply create the calibrations according to the `ausalib` [setup format](https://git.kern.phys.au.dk/ausa/ausalib/wikis/Setup). You will find the setup (and [`matcher.json`](https://git.kern.phys.au.dk/ausa/ausalib/wikis/JSON-Matcher)) in the `~/<experiment name>/` folder.
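As a quick sanity check of the links, one can list the files in the analysis directory (a sketch; run it from wherever your `go4cesb` analysis sources live):
```bash
# Each file should show an arrow pointing into ~/<experiment number>/go4/
ls -l parameters.json AusaUserAnalysis.h AusaUserAnalysis.cxx
```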
## Grafana
[Grafana](https://grafana.com/) is a very nice graphing tool we use for time series data. We can send up all kinds of information and display it very easily. Currently we send up information such as the VULOM scalers, the `drasi` stats and some more.
Grafana is hosted as a CERN webservice and can be accessed at
https://sec-monitor.web.cern.ch/
__NOTICE__: To login we use CERN's login system. The user is required to be subscribed to the `isolde` e-group, which can be done [here](https://e-groups.cern.ch/).
There are a few dashboards of interest. For the general setup performance, one can look at `Triggers` and `Dead time`, which show the rate and dead time for each trigger. We will make purpose-built dashboard(s) for each experiment, once we know which triggers go where, etc.
To push data to Grafana, open a terminal and issue (__DAQC__ must be running)
```bash
monitor
```
## Startup guide
This section will give you a step-by-step guide to starting up the acquisition from scratch. It is usually only required if one of the VME computers lost power or was rebooted. Otherwise a simple restart of __DAQC__ should be enough.
1. __Turn on the VME-crates and the NIM-crates__: Make sure that all crates have power and are running. Wait a minute or two, to make sure both VME computers are online.
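A quick way to verify this from __Polarbear__ is a standard `ping` (just a sanity check, not required):
```bash
ping -c 3 rio2
ping -c 3 rio7
```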
2. __Reconfigure the trigger modules__: We need to reconfigure the `VULOM`s. We can do this with the `rewire.sh` script on each VME computer. Do the following:
```bash
# On the slave (RIO2)
ssh rio2
cd daq/f_user
./rewire.sh
exit
# On the master (RIO7)
ssh rio7
cd daq/f_user
./rewire.sh
exit
```
3. __Start the mux-server__:
This is required for __DAQC__ to be able to change the triggers.
```bash
ssh rio7
su    # become root
service mux-srv start
```
4. __Start up DAQC__: Next we start up the acquisition:
```bash
daqc
```
Notice that `daqc` is an alias. You can read some more details about what this actually does in the __FAQ__. We do not, however, need to worry about that now.
It will take some time to start up. The sequence will end with something like
```
2018-06-14 11:02:37,833 - [rio7] Waiting for drasi 12
2018-06-14 11:02:38,835 - [rio7] Waiting for drasi 13
2018-06-14 11:02:39,837 - [rio7] Waiting for drasi 14
2018-06-14 11:02:40,839 - [rio7] Waiting for drasi 15
2018-06-14 11:02:41,841 - [rio7] Waiting for drasi 16
2018-06-14 11:02:42,843 - [rio7] Waiting for drasi 17
2018-06-14 11:02:43,845 - [rio7] Started Drasi
2018-06-14 11:02:43,846 - drasi_rio7 revived
```
5. __Check that the data acquisition is running__: We can now go to the __DAQC__ frontend to monitor everything. Go to a browser and enter `polarbear:8080`. You should now see __DAQC__. Everything should be green. That means we are happy.
If everything does not turn green within a minute or so, try to restart __DAQC__. This may, in particular, be required if only the crates died, but __DAQC__ kept running on __Polarbear__.
6. __Check that the shaper settings are loaded__: If the NIM-crates' power was off, we should also load the settings for these. This assumes you have saved the settings in `settings.json`. Consult an elog or someone who has been setting up the shapers about which file you should be using. Open a terminal and issue
```bash
mesyctrl
```
Check that we see all modules. Now use `l` to load the settings file.
7. __Push data to Grafana__: Open a terminal and issue (__DAQC__ must be running)
```bash
monitor
```
## FAQ
- __Start DAQC__:
Simply open a terminal and issue:
```bash
daqc
```
You are now inside a screen session called `daqc`, and __DAQC__ is starting up. If __DAQC__ was already running, we simply entered the screen session called `daqc`.
__What happened? (for the interested reader):__
We run `daqc` in a screen session. This makes it easy to manage a single `daqc` instance from any terminal. Moreover, we run `daqc` in a so-called **virtual environment** or `virtualenv` due to the various Python dependencies. Therefore, when executing `daqc` we actually execute a script that makes sure that we are both inside a screen session and inside a `virtualenv`. The command `daqc` is thus just an alias that executes a script that does the following:
- Start a screen session called `daqc`.
- Go to the correct directory and source the `virtualenv`.
- Run the __DAQC__ command.
To detach from the screen session: `Ctrl+A D`.
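As a minimal sketch, such a wrapper could look like the following (the directory, the virtualenv location and the final command are assumptions for illustration, not the actual script):
```bash
#!/bin/bash
SESSION=daqc
# If a "daqc" screen session already exists, simply re-attach to it
if screen -ls | grep -q "\.${SESSION}"; then
    exec screen -r "$SESSION"
fi
# Otherwise, start a new session that enters the virtualenv and runs DAQC
# (the paths and the daqc-run command are hypothetical)
exec screen -S "$SESSION" bash -c \
    'cd ~/DAQC && source venv/bin/activate && daqc-run'
```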
- __Stop or Restart DAQC__:
Go to the screen session running __DAQC__ and do a `Ctrl + C`. Be patient, it will take some seconds to power down, do not keep hitting `Ctrl + C`.
If you want to start it again, simply run
```bash
daqc
```
If the readout does not restart properly (it takes more than 20 seconds and just keeps going), try to stop __DAQC__, issue `die_drasi`, and then start __DAQC__ again.
- __See DAQC__:
Go to a browser and open `polarbear:8080`
- __Start a file__:
Go to the browser with __DAQC__ (`polarbear:8080` in the URL). Hit the big blue button saying "Start". __REMEMBER__ to fill in the run sheet information!!!
- __Stop a file__:
Go to the browser with __DAQC__ (`polarbear:8080` in the URL). Hit the big blue button saying "Stop". __REMEMBER__ to fill in the run sheet information!!!
- __Find my data__:
The raw data is located in `~/<experiment name>/data/raw`. You can unpack it into a simple ROOT format with the `unpackMBS.sh` script in the `data` directory. It will place the unpacked files in the `unpacked` directory:
```bash
./unpackMBS.sh raw/15_*
```
- __Start Go4__:
To start __Go4__, open any terminal and issue
```bash
go4
```
Enter the `localhost:5000` tab on the left, and then the `Histograms` folder.
- __See the VULOM scalers__:
You can get the VULOM scalers from both the slave and the master. However, the master node is most interesting. To see the scalers on the master:
```bash
ssh rio7
scalers
```
- __Change the trigger__:
Go to the browser with __DAQC__ (`polarbear:8080` in the URL). On the lower left you can see the active triggers. Click the edit icon, and press the names of the triggers you want to toggle. Hit OK. You can also downscale a trigger by a power of 2.
- __Add synchronization destinations__:
__DAQC__ can automatically copy the raw files and log files to a remote destination. In order to do this, you must do two things:
1. Make sure that you can `ssh` into the remote without any password, i.e. set up key-based authentication using the public key from __Polarbear__ (`~/.ssh/id_rsa.pub`). For example, using the placeholder host from step 2 below:
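```bash
# Install Polarbear's public key on the remote host
ssh-copy-id myuser1@myserver1.mydomain.org
# Verify that password-less login now works
ssh myuser1@myserver1.mydomain.org true
```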
2. Go the `~/DAQC/settings.yml` and edit the `sync` part to include your remote(s). Say you have two remotes called `myserver1.mydomain.org` and `myserver2.mydomain.org` with users `myuser1` and `myuser2`. To sync the data to the directory called `/remotedata`, edit `settings.yml` to include
```yaml
sync:
  destinations:
    - myuser1@myserver1.mydomain.org:/remotedata/
    - myuser2@myserver2.mydomain.org:/remotedata/
  ripe_age: 20
  active_wait: 4
  method: rsync -a --progress
```
3. Restart __DAQC__
- __Start the Mesytec control__:
Find a terminal and issue
```bash
mesyctrl
```
- __Save the Mesytec settings__:
Go to the MesytecControl. If you are inside a module, exit this with `x`. Now enter `s` and follow the commands.
- __Load the Mesytec settings__:
Go to the MesytecControl. If you are inside a module, exit this with `x`. Now enter `l` and follow the commands.
- __Add a Mesytec settings file to DAQC__:
In the __DAQC__ settings file `settings.yml`, you will find a section called `mesytec`, with a tag called `file`. Change this to point to the location of the settings file you want to use for reference. Restart __DAQC__.
## Troubleshooting
Since the DAQ consists of so many moving parts, a lot can go wrong. However, a lot of these moving parts are non-essential, i.e. data can still be taken. The readout itself - the two __RIO__'s and the event builder on __Polarbear__ - is very, very stable. We have yet to see a proper crash of these.
What is much more likely to cause problems, and probably will during every experiment, is __Go4__. Therefore, this should be the first suspect.
The following flowchart will hopefully catch most situations. If you are stuck in a loop (and nothing starts working), do call an expert!