The demands on models for performance and reliability analysis are constantly increasing, since the increasing complexity of real systems means that experiments on the system are not feasible or would require an unacceptable amount of effort. At the same time, however, the demands on the quality of results to be achieved are also increasing, so that real processes and thus also correlations must be modeled with sufficient accuracy.
The approaches available so far for adapting the parameters of a MAP to real traces still have some deficits for practical use. These deficits are to be eliminated or at least reduced within the scope of the work priorities described in the following. The goal is to perform the parameter adjustment of MAPs as efficiently and robustly as is possible with the methods for phase distributions developed in recent years.
In the second phase of the project, the extension of the developed methods for parameter fitting of MAPs to RAPs is in the foreground, in order to extend the class of models that can be analyzed with numerical methods beyond Markov processes, without having to introduce fundamentally new analysis techniques.
There are two basic approaches to modeling real processes on the basis of traces. The entire trace can be used for adaptation or measured values can be extracted from the trace, which are then reproduced as accurately as possible by a MAP. When using the full trace, all available information is used. By maximizing the likelihood function in EM algorithms, algorithms are also available that use this information. Practically, however, such an approach is not feasible, since today's traces from computer networks must contain several million entries in order to really adequately represent the relevant characteristics. An EM algorithm that takes the entire trace into account must calculate the value of the likelihood function over all entries in each iteration. Besides numerical problems this leads to immense runtimes already for one iteration. The alternative is to calculate individual measures from the trace. For the distribution adjustment moments and values of the distribution function or quantiles are used. For the representation of the correlation the adjustment of the autocorrelation coefficient is essentially used. With this approach, the trace only needs to be run once to calculate the measure values, but not to fit the parameters.
For distribution fitting, aggregation methods have been developed that significantly reduce the number of elements in a trace without distorting its characteristics. Thus, a methodology is available for distribution fitting to represent the required information from a trace in a compact form.
The situation is different for MAP adaptation. Experiments show that low-order autocorrelation coefficient fitting is not sufficient. There are also no aggregation methods that ensure that the correlation characteristics are preserved.
Using existing traces, we will therefore empirically investigate which measures are particularly suitable for describing the effects of correlation. For this purpose, trace-driven simulations will be used to investigate the behavior of models in which the real processes serve as input streams. Then, the same models will be run with MAPs that adjust individual characteristics of the trace. It is hoped that in this way, similar to distribution fitting, relevant measures can be defined and aggregation approaches for traces can also be developed.
A variety of methods now exist for fitting the parameters of a MAP to measured data. The idea of performing distribution and correlation fitting in two steps has proven successful: Formulating the problem as a general optimization problem with constraints and sequentially fitting distributional and correlation aspects, and treating the problem as a multiobjective optimization problem clearly resulted in better and more efficient algorithms to run than previously existed.
These two approaches to the adaptation of distributional and correlation properties for MAPs are to be combined in a hybrid algorithm. Thus, it seems reasonable to first modify both matrices in some iteration steps using evolutionary algorithms and then to use other optimization methods for fine tuning.
MAPs have so far been used primarily in analytical models. However, since they have the potential to represent correlations very well, it makes sense to integrate them into simulation models as well. So far, the following problems stand in the way:
The modeling capabilities of most simulators are limited to providing various distributions, so stochastic processes cannot be specified by default.
The parameterization of MAPs is complex and their manual specification of processes is laborious.
The long known stochastic processes like autoregressive processes are used by specialists in simulation. However, methods for parameter adjustment of such processes are not available even in widely used software tools for data modeling for simulation models such as ExpertFit or the Arena Input Analyzer. MAPs, as a recent development, have not received any attention and, to the best of our knowledge, have not yet been compared with autoregressive models.
The aim is to remove the aforementioned obstacles to the use of MAPs in simulation. In addition, an evaluation of the potential of MAPs in comparison to other processes is to be carried out.
One of the strengths of MAPs is that the superposition of MAPs and also the departure process of operator stations with MAP arrival process and MAP operator time distribution is again a MAP. Unfortunately, however, the state number of the resulting MAP increases exponentially with the number of MAPs considered or is infinite, as in the case of operator stations with unconstrained waiting space. This means that the resulting MAPs are no longer manageable. The goal is to develop a methodology that can be used to aggregate states of a MAP such that the resulting MAP provides lower or upper bounds for the dispatch process. Using a special partial stochastic ordering for the states of a Markov process, it can be ensured that starting from a stochastically larger state, all transient and steady-state power magnitudes are at least as large as the power magnitudes reached starting from a stochastically smaller state. If a set of states are replaced by the largest state in the set, a stochastically larger aggregate is obtained; if all states are replaced by the smallest, a stochastically smaller aggregate is obtained. The concept has been developed so far for general interacting Markov processes and it can be shown that under restrictive conditions the order is preserved under composition.
When applied to MAPs, some special features arise. For example, the behavior of a MAP is not affected by its environment. Furthermore, one is only interested in the process generated by the MAP, not in the state distribution of the MAP. Thus, the relatively restrictive conditions on the stochastic order can be softened, resulting in a larger potential of aggregable states. The result of such order-preserving aggregation is a MAP in which it is ensured that more/less arrivals are generated within each time interval than with the original MAP. This can be formally demonstrated by showing that the resulting departure process of the aggregated MAP is stochastically smaller/larger than the departure process of the original MAP. It then remains to show that the system, which has the MAP as input process, behaves monotonically with respect to the relevant performance variables as the number of arrivals increases.
Up to now, individual processes were considered whose time behavior was correlated. In some practical applications, however, correlations also occur between different quantities. Examples are the correlation of inter-arrival times and message sizes in computer networks or the correlation of the occurrence time of errors and the repair time. Simple MAPs are not sufficient for modeling such correlations; rather, different MAPs must be combined. For example, the simplest model to account for correlations between inter-arrival and service times consists of a MAP/PH/1 station whose service time depends on the inter-arrival time. This can be implemented by having the phase of the MAP from which the arrival occurred model the starting distribution of the PH distribution. This model can be interpreted as a system with n customer classes, where each phase of the MAP from which an arrival can occur defines a new class. Models of this type cannot be analyzed using standard algorithms for MAP/PH/1 models. Nevertheless, the special structure of the model allows a numerical calculation of the state probabilities. However, the current analysis approach still leaves a number of questions unanswered. For example, algorithmic aspects have not yet been considered at all, and it is also not clear for which model classes the analysis can be extended, since the previous approach only investigates simple discrete-time models.
It has to be investigated to what extent efficient algorithms for the analysis of MAP/PH/1 systems can be transferred to models with correlated inter-arrival and service times. Furthermore, it is to be investigated for which models an analytical approach based on known matrix-analytical methods is conceivable. In particular, multi-operator systems and systems with finite waiting space are considered.
The problem of parameterizing correlated MAPs or PH distributions has not been addressed at all in the literature. For adaptation, data streams have to be determined, each containing at least two values per datum. For example, the inter-arrival time and the message size can be measured. The goal of the parameterization is now to generate a MAP that models the inter-arrival times well. At the same time, a PH distribution (or MAP) must be found that models the message sizes. The relationship between inter-arrival time and service time demand is determined by the dependence of the starting distribution for service time on the departure phase of the arrival MAP. This relationship must also be modeled. Although the problem is significantly more complex than simply fitting a MAP, preliminary research suggests that methods for parameter fitting MAPs should be extensible.
Approximation and bound computation for tandem and fork-join networks with MAPs.
When MAPs are used as processes in models, the analyzability of the resulting models is the key aspect. Due to the so-called state space explosion, the size of the state space grows combinatorially with the dimension of the MAPs and PH distributions. Open systems can only be analyzed in terms of individual stations. In this respect, there is a great need to analyze queueing networks with non-exponential service and arrival times, at least approximately. There are numerous approaches that deal with such procedures. It turns out that the most commonly used decomposition methods are particularly suitable for systems with no or only little feedback. However, a general disadvantage is an unknown approximation error.
In the development of new and improved methods for the approximate analysis of queueing networks with MAPs and PH distributions, tandem and fork-join networks are to be the focus of investigations.
The second project phase focuses on an extension of the methods developed in phase 1 for MAPs to Rational Arrival Processes (RAPs). RAPs describe a superclass of MAPs and can no longer be interpreted as Markov chains. Due to this lack of interpretability, it is difficult for matrices to decide whether they describe a valid RAP. Therefore, the second phase of the project aims at a characterization of RAP matrices. For MAPs, equivalence relations and minimization algorithms can be defined using lumpability and stochastic bisimulation. By investigating the equivalence of MAPs and RAPs, similar relations and algorithms will be developed for RAPs as well. Stochastic Automata Networks (SANs) are one of the approaches used to specify Markov processes for performance and reliability analysis. SANs are a compositional approach that allows a concise description and is based on the fact that the composition of Markov processes in SANs describes Markov processes again. It is to be examined to what extent the composition of RAPs results again in a RAP, in order to be able to use Rational Automata Networks for modeling as a generalization of SANs.
In order to increase the practical applicability of RAPs, algorithms for parameter adaptation of RAPs will be developed and integrated into the existing toolkit ProFiDo. For the analysis of models with RAPs, methods for the use of RAPs in simulation and algorithms for the numerical analysis of models with RAPs will be developed in this context.