<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="http://dspace-roma3.caspur.it:80">
    <title>ArcAdiA</title>
    <link>http://dspace-roma3.caspur.it:80</link>
    <description>The DSpace digital repository system captures, stores, indexes, preserves, and distributes digital research material.</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://hdl.handle.net/2307/514" />
      </rdf:Seq>
    </items>
    <dc:date>2013-05-22T18:29:34Z</dc:date>
  </channel>
  <item rdf:about="http://hdl.handle.net/2307/514">
    <title>Root cause analysis and forensics in interdomain routing: models,  methodologies and tools</title>
    <link>http://hdl.handle.net/2307/514</link>
    <description>&lt;Title&gt;Root cause analysis and forensics in interdomain routing: models,  methodologies and tools&lt;/Title&gt;
&lt;Authors&gt;Refice, Tiziana&lt;/Authors&gt;
&lt;Issue Date&gt;2009-04-02&lt;/Issue Date&gt;
&lt;Abstract&gt;The Internet is an interconnection of administrative domains called Autonomous Systems (ASes). Each AS contains one or more destination networks, and each network is identified by an IP prefix. The Border Gateway Protocol (BGP) [RLH06] is the de facto standard routing protocol used to exchange reachability information among ASes; a BGP session between two distinct ASes is called a peering. Through BGP, each AS learns its "best" route towards each destination in the Internet, updates it in response to network events (e.g., link failures, router resets, or policy changes), and propagates the change through BGP messages called updates. The propagation of BGP updates can be partially controlled via routing policy specifications.
In order to investigate the Internet behavior over time, several repositories provide historical data. Since 1997 and 1999, respectively, the University of Oregon RouteViews Project (RV) [roua] and the RIPE NCC Routing Information Service (RIS) [roub] have deployed passive monitors (or vantage points) worldwide, which continuously gather BGP routing data from the Internet, permanently store them, and make them publicly available. Currently, there are about 800 such monitors. Also, in 1995 the Internet Routing Registry (IRR) was established and started collecting the inter-AS routing policies of many of the networks in the Internet, with the main purpose of promoting the stability, consistency, and security of global interdomain routing.
As the Internet becomes an increasingly critical infrastructure, the need to understand and (at least to some extent) control interdomain routing increases. Internet Service Providers (ISPs), in order to improve the quality of service offered to their customers, want to monitor the reachability of specific prefixes, check the effectiveness of their own routing policies, and assess the impact of traffic engineering configurations. In this context, it is crucial to be able to detect and debug misconfigurations or faults so that they can be fixed. More generally, the problem of identifying Internet events, locating their root causes, and understanding their dynamics is attracting increasing attention from both researchers and network operators.
However, despite the large amount of research effort, diagnosing routing dynamics remains very difficult for several reasons: (i) The system is very large. As of December 2008, there are about 280,000 prefixes and more than 30,000 Autonomous Systems densely connected to each other. (ii) The Internet is highly dynamic. In fact, RIS and RV monitors currently receive an average of about 1,500 BGP updates per minute, with peaks of more than 50,000 updates per minute. (iii) Due to the complex interconnections among ASes and to routing policies, the effects of network events are often separated (both in time and space) from their causes, and different vantage points record different data in response to the same routing changes. Also, multiple routing events can occur simultaneously. Overall, given such size and dynamics, "naive" approaches to extracting relevant information from Internet routing data are neither effective nor efficient.
Therefore, both researchers and network operators interested in understanding interdomain routing have to cope with several major challenges. First, in order to deal with such a huge and complex network, they need to define what to measure, i.e., they need a model of Internet routing that captures the main dynamics while filtering out the "noise" (e.g., routing changes that do not provide information relevant to the identification of network events). Based on such a model, they need a methodology that, given the currently available data sources, detects network events and infers when and where they happened. Furthermore, they need tools that efficiently handle the huge amount of data, support the analysis of network behavior over time, and provide real-time information in order to spot and possibly fix outages as soon as they occur. Since the analysis of network events often requires manual work, effective paradigms for the visualization of routing data are also very helpful. Previous works leave most of these problems open.
The research work described throughout this thesis addresses these problems and proposes approaches to (at least partially) solve them. Namely, this thesis presents the following contributions.
This thesis illustrates a new perspective to drive the analysis of Internet dynamics without getting lost in the huge BGP dataset. While previous works usually address root cause analysis from a "global perspective" (i.e., by taking into account the dynamics of the whole Internet and trying to identify major events affecting it), this thesis tackles the same problem with an ISP-oriented approach: it assumes that ISPs are usually more interested in the reachability of their own prefixes than in the status of the whole network; hence, it focuses the analysis on user-specified prefixes and correlates their behavior with the global Internet dynamics. In particular, this thesis formally models the Internet as a flow-based system, where monitors are the sources of the flows and ASes originating BGP updates are the sinks. This thesis also defines a methodology that correlates variations in such flows with routing changes in order to spot network events and the root causes that triggered them. Furthermore, BGPath has been developed to support this methodology, and this thesis describes its main features. BGPath is a publicly available tool that uses BGP data collected by the RIS and RV projects and provides the user with routing information from both single and cross-vantage-point views. BGPath also assesses the reliability of the collection system in order to avoid measurement artifacts. The algorithms BGPath relies on are shown to efficiently process huge streams of BGP data, fulfilling near-real-time constraints.
While the ISP-oriented approach presented in this thesis gives good insight into both major and minor events affecting specific portions of the Internet, approaching the root cause analysis problem from a "global perspective" usually does not provide such fine-grained results. On the other hand, the global approach is critical to identify major interdomain events without any a priori knowledge of the prefixes and/or the ASes involved. This thesis explores this perspective too. Specifically, this thesis proposes a novel methodology based on Principal Component Analysis (PCA), a well-known statistical technique that is commonly used to reduce the number of dimensions of multi-dimensional datasets in order to highlight the most significant trends in the data. Since the interdomain routing dataset is inherently multi-dimensional (in time, space, prefixes, observation points, ...), this thesis suggests applying PCA to this dataset in order to identify the most significant contributors to the Internet dynamics.
BGP data collected by RIS and RV monitors provide a detailed view of the actual status of interdomain routing. However, they do not report the inter-AS peering relationships that are not active. For example, in "normal" conditions, backup links do not appear in the routing tables. Still, in order to understand the reasons behind some network events and to predict the evolution of the routing when an event occurs, such information is actually very important. To cope with the intrinsic limitations of the RIS and RV datasets, this thesis analyzes the data stored in the Internet Routing Registry and describes how to extract peering relationships from the routing policies collected within it. Moreover, the proposed approach specifies how to resolve inconsistencies among the distinct databases the IRR consists of. The obtained results show that, even though the IRR data is often out of date, it still provides a unique amount of topological information that usually does not appear in the global routing.
The research work described in the thesis relies on the assumption that the Internet is a graph where ASes are atomic entities in the interdomain routing. However, recent papers [MFM+06, MUF+07] show that such a model can mislead the understanding of the global routing behavior. Thus, this thesis investigates this problem by measuring the route diversity that can be observed by passive remote vantage points, defining a methodology to compute it from a dynamic BGP dataset, and characterizing it in terms of the location of ASes in the Internet customer-provider hierarchy and the choice of monitors.
The thesis documents a forensic analysis of two well-known events that occurred at the beginning of 2007, where the models, methodologies, and tools described in the thesis are exemplified using real case studies.&lt;/Abstract&gt;</description>
    <dc:date>2009-04-01T22:00:00Z</dc:date>
  </item>
</rdf:RDF>

