From ashmansk@hep.uchicago.edu Fri Mar 16 23:50:56 2001
Date: Fri, 8 Dec 2000 00:38:47 -0600 (CST)
From: Bill Ashmanskas
To: Marjorie Shapiro
Cc: Jim Amundson, wolbers@fnal.gov, watts@physics.rutgers.ed,
    r.stdenis@physics.gla.ac.uk, lmark@cdfsga.fnal.gov,
    stefano.belforte@ts.infn.it, rharris@fnal.gov, ksmcf@fnal.gov,
    sexton@fnal.gov
Subject: Re: Offsite Database Export Review

I have only begun to do my homework for next Tuesday's meeting, but I'd
like to point out (1) that I am unable to print CDF note 5352 past page 5
on typical CDF printers (e.g. b0tr146h_hp8000), and (2) that the link to
Mark's slides from
http://www-cdf.fnal.gov/internal/upgrades/computing/database/minutes/minutes_000328.html
seems to be broken.

While I'm sending mail around, I thought I'd mention a few of the issues
that I am currently guessing will be addressed in the review.  Of course I
may learn something from the official committee charge or from the reading
material that causes me to discard this list, but anyway, here it is.  The
point is not to ask for specific answers to each of these questions, but
rather to tell you what kinds of things are on my mind.

 * What is the ($) cost per machine or per institution (e.g. licensing)?

 * What computing resources will each institution need to allocate?

 * How much effort will be needed by local experts or system
   administrators to keep each system running?  What about Fermilab
   experts/administrators?

 * Will either solution make unusual demands on file systems, e.g. tens
   of thousands of 100-byte files?

 * Will there be periodic updates or maintenance, and if so, what is the
   computing hardware or computing time burden imposed (e.g. N hours of
   CPU and M gigabytes of scratch space on such-and-such a workstation,
   daily)?

 * How quickly (in real time) will one be able to read the beamline (four
   real numbers, perhaps with a covariance matrix) for 1000 runs, in each
   case, in an offline analysis program that does nothing but this (and
   is sparse in run numbers, as the W samples used to be)?  (See the
   sketch of such a timing test after this message.)

 * Same question, but somewhat more challenging, e.g. the constants
   needed to re-run full COT tracking.

 * Same question, but even more challenging, e.g. the constants required
   to re-run SVX clustering.

 * Will there be a performance difference between first access and
   subsequent access to the data for a given (apologies for old
   terminology) component/attribute/run-number?

 * Is this choice expected to affect compile/link time, executable image
   size, or virtual memory usage for remote applications that make DB
   accesses?  Or is it some single process running behind the scenes,
   accessed through some common network-like API?

 * How robust is each solution if the network connection to Fermilab is
   (a) up but extremely slow or (b) down?

-Bill
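
A minimal sketch of the beamline timing test described above, in C++.
Everything here is a placeholder assumption: fetchBeamline(), the Beamline
struct, and the run list are stand-ins, not part of any existing CDF
database interface.  The point is only to show the shape of the test:
loop over a sparse list of ~1000 run numbers, fetch four numbers (plus an
optional covariance matrix) per run, and report the wall-clock time.

    // Wall-clock timing sketch for reading beamline constants.
    // fetchBeamline() is a stub standing in for whichever access call
    // the chosen database solution provides.

    #include <array>
    #include <chrono>
    #include <cstdio>
    #include <vector>

    struct Beamline {
        std::array<double, 4>  params;  // e.g. x0, y0, slope_x, slope_y
        std::array<double, 16> cov;     // optional 4x4 covariance matrix
    };

    // Placeholder: replace with the actual DB access call under review.
    Beamline fetchBeamline(int run) {
        Beamline b{};
        b.params = {0.1 * run, 0.0, 0.0, 0.0};  // dummy values
        return b;
    }

    int main() {
        // Sparse run list: 1000 run numbers spread thinly over a wide range,
        // roughly mimicking a skimmed sample.
        std::vector<int> runs;
        for (int i = 0; i < 1000; ++i) runs.push_back(100000 + 137 * i);

        auto t0 = std::chrono::steady_clock::now();
        double checksum = 0.0;
        for (int run : runs) {
            Beamline b = fetchBeamline(run);
            checksum += b.params[0];  // keep the loop from being optimized away
        }
        auto t1 = std::chrono::steady_clock::now();

        double seconds = std::chrono::duration<double>(t1 - t0).count();
        std::printf("read %zu beamlines in %.3f s (%.1f ms/run), checksum %g\n",
                    runs.size(), seconds,
                    1000.0 * seconds / runs.size(), checksum);
        return 0;
    }

Running the fetch loop a second time within the same job would also probe
the first-access versus subsequent-access question raised in the list.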