The Trigger Supervisor of the ARGO-YBJ detector

Stefano Mastroianni
for the ARGO-YBJ Collaboration

I.N.F.N. Sezione di Napoli and Dipartimento di Fisica dell’Università di Napoli,
Via Cintia - 80126 Napoli, Italy
E-mail: mastroianni@na.infn.it

Abstract—ARGO-YBJ is a full-coverage air shower detector under construction at the Yangbajing Laboratory (4300 m a.s.l., Tibet, People’s Republic of China). Its main fields of research are gamma-ray astronomy and cosmic-ray studies. The detector covers ~5800 m² with a single layer of Resistive Plate Counters (RPCs), surrounded by a partially instrumented guard ring. This paper describes in detail the ARGO-YBJ Trigger Supervisor, which provides the interface between the Data Acquisition and the Trigger System. It is a simple and robust control instrument that continuously monitors the dead time at different levels of the DAQ architecture. We also present the results of the first pilot runs at the Yangbajing Laboratory.

I. INTRODUCTION

The ARGO-YBJ experiment (Astrophysical Radiation with Ground-based Observatory at YangBaJing) studies a wide class of phenomena in cosmic rays and astroparticle physics [1]. The apparatus has been designed to observe the secondary particles of the atmospheric cascade. The energy spectra of the showers of interest are distributed between ~100 GeV and ~5 PeV.

The detector is presently under construction at the YangBaJing High Altitude Laboratory, near Lhasa. It consists of a central carpet of 74 × 78 m², made of a single layer of Resistive Plate Counters (RPCs) and surrounded by a partially instrumented guard ring, for a total instrumented area of about 6700 m². The RPCs work in streamer mode and each chamber is read out by means of 80 pick-up strips. The strip signals are amplified and digitally shaped by a custom VLSI chip. The ARGO-YBJ detector is divided into 18480 basic elements and provides the space-time pattern of the shower front.

About 2500 m² of RPCs have already been assembled, and 30% of them have also been fully equipped with the final data acquisition and trigger electronics. The main DAQ features are block-oriented data transfer and a time resolution of about 1 ns. In addition, the Pad signals are the input to the Trigger logic. The ARGO-YBJ trigger system has been presented elsewhere [3]; here we only recall that the expected trigger rate is about 10 kHz with an average event size of 4 kB. The trigger signal is forwarded to the Local Stations, where it acts as a common stop for all the TDCs. The local information, made of the TDC outputs and the patterns of the fired strips, is collected and transmitted to the DAQ system.
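
As a quick sanity check on the figures quoted above, the average data throughput implied by the expected trigger rate and event size can be estimated as their product. The sketch below is illustrative only: the function name is hypothetical and 1 kB is taken as 1000 bytes, which the text does not specify.

```python
def average_throughput_bytes_per_s(trigger_rate_hz: int, event_size_bytes: int) -> int:
    """Average DAQ throughput as trigger rate times mean event size."""
    return trigger_rate_hz * event_size_bytes


# Values quoted in the text: ~10 kHz trigger rate, ~4 kB average event size.
# Assumption: 1 kB = 1000 bytes.
throughput = average_throughput_bytes_per_s(10_000, 4_000)
print(f"{throughput / 1e6:.0f} MB/s")  # prints "40 MB/s"
```

This is an average over events; the instantaneous load on a single Cluster depends on the shower topology.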

The DAQ adopts a two-level concentration scheme. It implements an event-driven data collection by using two custom bus protocols. Each Level-1 (L1) crate contains up to 40 data buffer channels, one for each Cluster. A Level-1 read-out controller collects the front-end data via the L1bus [4]. Up to 8 L1 controllers can be chained in a vertical connection and acquired by a Level-2 (L2) controller. This vertical connection is implemented by a custom L2bus [5]. The main DAQ features are block-oriented data transfer and read-out cycles labelled by event number.

Fig. 1 shows a simplified model of the data acquisition system. In this context we are interested in the generation of the Busy signals, which give rise to the DAQ dead time. For our purpose, all the modules are drawn as FIFO memories and the hierarchical build-up of the Busy signal is depicted.

When the trigger signal arrives at the front-end, the data frames built in the Local Stations are pushed into the L1 data buffers at a rate of 20 MB/s. During this transfer, the Local Station continuously asserts a local Busy to prevent the generation of further triggers. Its width is proportional to the length of the data frame to be transferred, and therefore changes with the number of fired Pads in the Cluster.
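
Since the local Busy lasts for the duration of the frame transfer, its width can be estimated from the frame length and the 20 MB/s transfer rate. The following is a minimal sketch under the assumption that the Busy covers exactly the transfer time, with no fixed overhead. The function name and the example frame sizes are hypothetical, and 1 MB is taken as 10^6 bytes.

```python
# 20 MB/s transfer rate from the text (assumption: 1 MB = 10**6 bytes).
LINK_RATE_BYTES_PER_S = 20_000_000


def local_busy_width_us(frame_bytes: int) -> float:
    """Estimated local Busy width, in microseconds, for a frame of the given length."""
    return frame_bytes * 1_000_000 / LINK_RATE_BYTES_PER_S


# Hypothetical frame sizes: doubling the frame length doubles the Busy width.
print(local_busy_width_us(200))  # 10.0
print(local_busy_width_us(400))  # 20.0
```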

Each data buffer stores the Local Station’s frame in a FIFO memory, which is available to the L1 controller via the L1bus. The Almost-full FIFO flag is ORed with the Local Station Busy to form a board-level Busy. Inside the L1 crate, the L1 controller receives the wired-OR of all the data buffers’ Busy signals (crate Busy). The L1 controller gathers the Local Stations’ frames temporarily stored in the data buffers, builds a new frame indexed by the event number and writes it into a local FIFO memory available to the L2 controller via the L2bus. At this level too, the Almost-full FIFO flag is ORed with the crate-level Busy to form the L1 Busy signal. The L2 controller collects from the L1 controllers all the frames belonging to a given event number and appends them to its local FIFO, which is read out by a CPU board via the VMEbus. The L2 controller’s Almost-full flag represents the L2 Busy signal. Fig. 2 shows the Local Station, L1 and L2 Busy signals following a trigger pulse.
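
The Busy hierarchy described above is a chain of logical ORs, which can be sketched in a few lines. This is a purely logical model, not the hardware implementation: the class and function names are hypothetical, and `True` stands for "Busy asserted" regardless of the electrical polarity of the real signals.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataBuffer:
    """One L1 data buffer channel (one per Cluster)."""
    local_station_busy: bool = False  # asserted during the front-end transfer
    fifo_almost_full: bool = False    # Almost-full flag of the buffer FIFO

    def board_busy(self) -> bool:
        # Board-level Busy: Almost-full flag ORed with the Local Station Busy.
        return self.local_station_busy or self.fifo_almost_full


def crate_busy(buffers: List[DataBuffer]) -> bool:
    # Crate Busy: wired-OR of all the data buffers' Busy signals.
    return any(b.board_busy() for b in buffers)


def l1_busy(buffers: List[DataBuffer], l1_fifo_almost_full: bool) -> bool:
    # L1 Busy: crate Busy ORed with the L1 controller's Almost-full flag.
    return crate_busy(buffers) or l1_fifo_almost_full


def l2_busy(l2_fifo_almost_full: bool) -> bool:
    # L2 Busy: the L2 controller's Almost-full flag alone.
    return l2_fifo_almost_full


buffers = [DataBuffer() for _ in range(40)]          # up to 40 channels per L1 crate
print(l1_busy(buffers, l1_fifo_almost_full=False))   # False: everything is idle
buffers[7].local_station_busy = True
print(l1_busy(buffers, l1_fifo_almost_full=False))   # True: one Local Station is transferring
```

In this model a single busy Local Station is enough to assert the L1 Busy of its crate, while the L2 Busy depends only on the L2 FIFO occupancy.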

The Local Station Busy is driven by the front-end logic in order to inhibit the generation of triggers during the TDC read-out. This is due to the internal architecture of the Local Station, which does not allow a new acquisition to be pipelined with the read-out. The L1 Busy shape shows two main components: a replica of the Local Station Busy and the effects of the FIFO flags triggered by the Almost-full boundary conditions. The L1 Busy depends upon the difference between the data throughput into the data buffers’ FIFOs and the read-out rate sustainable by the L2 controller. On the other hand, the L2 Busy is dominated by the CPU read-out on the VMEbus. While the Local Station Busy is fully ruled by the hardware, the L2 Busy heavily depends upon the software running on the CPU and the VMEbus block transfer performance. In this scheme, the L1 controllers decouple the L2 read-out from the front-end data traffic. As such, the L1 Busy originates from both the front-end and the VMEbus traffic. These sources shape the signal with a leading low level (the footprint of the Local Stations’ Busy) followed by multiple ringing (due to the FIFO Almost-full boundary). The duty cycle analysis alone does not allow us to identify and evaluate the impact of the two sources. However, by measuring the toggle rate of the L1 Busy we can easily check the effectiveness of the L1 vs. L2 decoupling. If the L1 controllers’ FIFOs never cross the Almost-full boundary, the L1 Busy is basically a replica of the Local Stations’ Busy signals, which are always asserted after a trigger. In this condition, the L1 Busy toggle rate equals the trigger rate, the L1 and L2 controllers run asynchronously with respect to each other and the Local Stations are fully decoupled from the VME CPU-driven read-out.
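
The decoupling check described above compares the Busy toggle rate with the trigger rate. The sketch below shows one way to compute both the duty cycle and the toggle rate from a sampled Busy waveform. The sampling rate, trigger rate and waveforms are hypothetical, and `True` means "Busy asserted" regardless of the electrical polarity of the real signal.

```python
from typing import Sequence


def duty_cycle(samples: Sequence[bool]) -> float:
    """Fraction of samples in which Busy is asserted."""
    return sum(samples) / len(samples)


def toggle_rate_hz(samples: Sequence[bool], sample_rate_hz: int) -> float:
    """Number of Busy assertions (inactive-to-active transitions) per second."""
    rising = sum(1 for prev, cur in zip(samples, samples[1:]) if cur and not prev)
    return rising * sample_rate_hz / len(samples)


SAMPLE_RATE_HZ = 1_000_000  # hypothetical 1 MHz sampling
TRIGGER_RATE_HZ = 100_000   # hypothetical: one trigger every 10 samples

# Decoupled case: a single Busy pulse per trigger (replica of the Local Station Busy).
clean = ([False] * 2 + [True] * 3 + [False] * 5) * 10
# Coupled case: the same pulse followed by ringing from the Almost-full flag.
ringing = ([False] * 2 + [True] * 3 + [False, True, False, True, False]) * 10

for name, busy in (("clean", clean), ("ringing", ringing)):
    rate = toggle_rate_hz(busy, SAMPLE_RATE_HZ)
    print(name, duty_cycle(busy), rate, rate == TRIGGER_RATE_HZ)
# clean 0.3 100000.0 True
# ringing 0.5 300000.0 False
```

In the clean case the toggle rate matches the trigger rate. In the ringing case the extra assertions push it above the trigger rate, which the duty cycle alone would not reveal.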

In the same fashion, the combined analysis of the L2 Busy duty cycle and toggle rate allows us to optimize the software running on the CPU in order to achieve the highest data throughput.

The L1 and L2 controllers make their Busy signals available on the front panels. The logical-OR of all L1 and L2 Busy signals forms the System Busy. Its duty cycle gives us the Total Dead Time (TDT) of the data acquisition process.

The DAQ controllers also drive the Halt signal to flag an unrecoverable error. As inputs they receive the Trigger pulse and a special synchronization signal, called SyncR, which is used to verify the correct alignment of the event number throughout the DAQ. In case of a synchronization failure, the DAQ controllers assert both the SyncF(ailure) and Halt outputs. The Trigger and SyncR inputs are bundled together with the 3 flow-control lines (Busy, Halt and SyncF) in a single front-panel connector.

III. THE TRIGGER SUPERVISOR

The Trigger Supervisor (TS) has been specifically designed to monitor all the L1 and L2 trigger and control lines. It measures the duty cycle and frequency of the Busy signals, and it generates and distributes the Veto signal to the Trigger logic. It is also in charge of distributing the Trigger and SyncR signals to all the Local Stations and DAQ controllers and of measuring the trigger rate.

In order to be scalable, the TS is organized in a modular structure of VME boards housed in a dedicated crate with a custom backplane. The TS adopts a two-layer architecture, as shown in Fig. 3. The slave boards handle the L1 and L2 flow control signals (Busy, Halt and SyncF) of up to 4 DAQ controllers. The slaves receive from the master board the Trigger and SyncR signals and distribute them to the DAQ controllers. The master board can control up to 16 slaves.

The master and slave boards are implemented on a single hardware platform based on XILINX FPGAs, and they differ only in the configuration files loaded. The common platform is a VME double-height board with A32/D8, D16, D32 data transfer capabilities; its layout is shown in Fig. 4. The TS logic block diagram is shown in Fig. 5.

The slave front-end section is made of four identical ports, each handling the 5-signal bus of a DAQ controller. The front-end interface FPGA implements the Halt, SyncF and Busy logic for all four ports. It works like a logic analyzer, continuously measuring the duty cycle and frequency of each Busy input and writing the results to the on-board memory bank. The acquisition time base and the memory bank parameters are controlled and read out through the VMEbus. The memory contains a stripchart-like dump of the monitored quantities and is capable of storing up to 20 hours of analysis. This makes it possible to study the trend of each Busy signal (duty cycle and frequency) during the entire run. In each port, Busy, Halt and SyncF are ORed and transmitted to the master.

The master board receives the Trigger pulse from the Trigger logic and distributes it to all the slaves over the custom backplane. It also generates and controls the synchronization cycle by issuing the SyncR pulse. The master unit collects all the ORed Busy signals from the slaves and ORs them to generate the global Veto signal, which is sent to the Trigger logic to inhibit Trigger generation. This gives rise to the TDT of the apparatus. The master samples the Trigger rate and the duty cycle and frequency of the TDT. Like the slaves, the master writes all the measured quantities to the on-board memory bank.
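
The two-layer fan-in that produces the Veto can be modelled as a tree of ORs. The sketch below is a logical illustration only, with hypothetical function names and data layout. It assumes that each slave forwards the OR of its ports and that the master ORs the slaves. Since OR is associative, the resulting Veto is the same if the slaves forward their ports individually.

```python
from typing import Dict, List

MAX_PORTS_PER_SLAVE = 4     # DAQ controllers handled by one slave (from the text)
MAX_SLAVES_PER_MASTER = 16  # slaves controlled by one master (from the text)

Port = Dict[str, bool]


def idle_port() -> Port:
    """Flow-control lines of one DAQ controller, all inactive."""
    return {"busy": False, "halt": False, "syncf": False}


def slave_or(ports: List[Port]) -> bool:
    """OR of Busy, Halt and SyncF over the ports of one slave."""
    if len(ports) > MAX_PORTS_PER_SLAVE:
        raise ValueError("a slave handles at most 4 DAQ controllers")
    return any(p["busy"] or p["halt"] or p["syncf"] for p in ports)


def master_veto(slaves: List[List[Port]]) -> bool:
    """Global Veto: OR of the signals received from all the slaves."""
    if len(slaves) > MAX_SLAVES_PER_MASTER:
        raise ValueError("a master controls at most 16 slaves")
    return any(slave_or(ports) for ports in slaves)


# Fully populated supervisor: 16 slaves x 4 ports = 64 DAQ controllers.
slaves = [[idle_port() for _ in range(MAX_PORTS_PER_SLAVE)]
          for _ in range(MAX_SLAVES_PER_MASTER)]
print(sum(len(ports) for ports in slaves))  # 64
print(master_veto(slaves))                  # False
slaves[3][1]["busy"] = True
print(master_veto(slaves))                  # True
```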

Fig. 4. The Trigger Supervisor board

Fig. 5. The Trigger Supervisor logic block diagram

IV. TEST RESULTS

During the first phase of the ARGO-YBJ data taking we studied the DAQ data flow in order to detect bottlenecks and to optimize the overall performance. The amount of data from each Cluster strongly depends on the shower topology. The Local Stations send to the data buffers event frames whose size is roughly proportional to the number of fired Pads.

In order to keep the TDT as low as possible it is essential to balance the load on the DAQ controllers, which requires a detailed analysis of every Busy signal in the system. The TS has been used intensively to fine-tune the timing performance of the DAQ during the experimental runs, without interfering with the data transfers and with minimal software overhead.

Fig. 6a shows the frequency of an L1 Busy signal plotted versus the trigger rate. The L1 Busy toggle rate equals the trigger rate up to 20 kHz, indicating an effective decoupling between the L1 and L2 controllers.

The TS allows us to break down the TDT into its L1 and L2 components. Fig. 6b shows that the L1 component is dominant up to a TDT of 20%; then, up to a TDT of 80%, the L2 component grows exponentially. Eventually, the L2 becomes the only dead time source in the extreme region above 80%. The typical TDT measured in our set-up does not exceed 10%.

It should be noted that the TDT is the logical-OR of the L1 and L2 components, so in general their sum does not equal the TDT. The two are equal only when the Busy signals do not overlap in time. Fig. 6c shows that in our set-up this condition is met up to a TDT of 50%. This means that the L1 and L2 activities are displaced in time and the two DAQ levels work in a pipeline. This behaviour can be achieved when the FIFO buffers are Almost-empty and the data transfer load shifts gracefully from the L1 to the L2.
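
The relation between the TDT and its two components can be illustrated with sampled Busy waveforms. In the sketch below the function name and the waveforms are hypothetical, and `True` means "Busy asserted". It computes the duty cycle of each component, of their OR (the TDT) and of their overlap, which satisfy TDT = L1 + L2 − overlap.

```python
from typing import Dict, Sequence


def dead_time_breakdown(l1: Sequence[bool], l2: Sequence[bool]) -> Dict[str, float]:
    """Duty cycles of L1 Busy, L2 Busy, their OR (TDT) and their overlap."""
    if len(l1) != len(l2):
        raise ValueError("both signals must cover the same sampling window")
    n = len(l1)
    return {
        "l1": sum(l1) / n,
        "l2": sum(l2) / n,
        "tdt": sum(a or b for a, b in zip(l1, l2)) / n,
        "overlap": sum(a and b for a, b in zip(l1, l2)) / n,
    }


# Pipelined case: L2 is busy only after L1 has released its Busy.
l1 = [True, True, False, False, False, False, False, False, False, False]
l2 = [False, False, True, False, False, False, False, False, False, False]
print(dead_time_breakdown(l1, l2))

# Overlapping case: L2 is already busy while L1 is still busy.
l2_overlap = [False, True, True, False, False, False, False, False, False, False]
print(dead_time_breakdown(l1, l2_overlap))
```

In the pipelined case the overlap is zero and the two components add up to the TDT. In the overlapping case their sum exceeds the TDT by exactly the overlap.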

V. CONCLUSIONS

A 6×7 Cluster detector slice was assembled at the Yangbajing Laboratory in December 2004. This slice is fully functional and is instrumented with all the electronics, including the Trigger Supervisor equipped with 1 master and 2 slave boards.

The DAQ data flow has been characterized by using all the features implemented in the TS. The analysis shown in this paper is a routine task during the physics and calibration runs. It allows us to understand and keep under control a complex DAQ system installed at a remote experimental site with a limited number of researchers.

A different TS implementation, capable of triggering on complex Busy patterns like a state-of-the-art logic analyzer, is presently under design. It will be used to track anomalous data transfers in real time, which can be due to detector noise or faults in the DAQ equipment.

The TS has proven to be a viable tool to measure the decoupling between the L1 and L2 DAQ levels and to size the FIFOs in an informed way to fit our experimental requirements.

REFERENCES