
Performance of the Automated Image Sequence Processing
To assess the feasibility of automated line extraction with 3D positioning, and consequently of its real-time realization, a rich set of candidate image processing functions was developed in a standard C++ programming environment. Figure 1 shows the overall dataflow and processing steps, which are described in more detail below. In short, real-time image processing is feasible because of the simple sensor geometry and the limited complexity of the imagery collected.

Figure 1. Real-time image processing and post-processing workflow
First, a single down-looking camera acquires consecutive images with about 50% overlap, which cover only the road surface to the side of the vehicle. Then centerlines are extracted from the images, followed by feature point extraction around the centerline area. Finally, the feature points are matched to build a strip from the images. This matching process is greatly facilitated by the simultaneous availability of navigation data: the change in position and attitude between two image captures (the relative orientation) dramatically shortens the search for conjugate entities in image pairs, since the usually two-dimensional search space is reduced to one dimension.
Traffic signs, centerlines and the like have distinct colors to draw drivers' attention in an unambiguous way. Therefore, color images are preferred over monochromatic ones. Figure 2 illustrates various cases, including an extreme situation where the yellow solid lines are hardly visible in the B/W image.

Figure 2. Centerlines of various qualities.
Although many image processing algorithms work with color data (multi-channel gray-scale imagery), the great majority of the core functions work only on simple monochrome image data. Therefore, a color space conversion is desirable: the 3D color space is reduced to one dimension, a color direction that gives the best possible separation of the objects to be distinguished. For example, moving from RGB to IHS (Intensity, Hue, Saturation) space effectively decouples intensity and color information. After some experiments, it was decided to use an RGB to S transformation, which is illustrated in Figure 3 under various road conditions. Dealing with a single channel is also a major benefit for the real-time implementation.
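The reduction from three color channels to a single saturation channel can be sketched as follows. This is a minimal illustration only: it assumes the common HSI definition S = 1 - 3·min(R, G, B)/(R + G + B), which may differ from the exact S formula used in the experiments, and the function names and the threshold value are hypothetical.

```python
def rgb_to_saturation(r, g, b):
    """Return HSI-style saturation in [0, 1] for non-negative RGB components.

    Assumed definition: S = 1 - 3 * min(R, G, B) / (R + G + B).
    """
    total = r + g + b
    if total == 0:
        return 0.0  # pure black: saturation is undefined, use 0 by convention
    return 1.0 - 3.0 * min(r, g, b) / total


def saturation_image(rgb_rows):
    """Convert a row-major image of (r, g, b) tuples to a single S channel."""
    return [[rgb_to_saturation(*px) for px in row] for row in rgb_rows]


def threshold(channel, level):
    """Binarize a single-channel image: 1 where the value is >= level."""
    return [[1 if v >= level else 0 for v in row] for row in channel]


# A yellow paint pixel next to a gray asphalt pixel (illustrative values).
row = [[(255, 220, 0), (90, 90, 90)]]
binary = threshold(saturation_image(row), 0.5)
```

With this definition a strongly colored pixel such as yellow paint maps to a high S value, while neutral gray maps to zero, so a single threshold on S can separate the two.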

Figure 3. Test images and their RGB-to-S transformed representations.
After the RGB to S transformation and filtering, the geometry of the centerlines is extracted from the binary images. Given the dimensions of the mapping vehicle and the geometry of the camera, the centerlines (in the transportation sense) show up in the images as multi-pixel-width lines (20-30 pixels wide), usually referred to as raster lines in the vision community. For a raster line, its centerline is of primary interest. In the literature, similar terms such as skeleton or medial line (medial axis) are used. A skeleton is essentially a one-pixel-width line, while the term centerline can denote a one-pixel-width line in both raster and vector representations. The mathematical definition of the centerline of a raster line varies with the algorithm used to generate it. In thinning algorithms, the skeleton is the set of pixels that have more than one nearest neighbor on the boundary of the raster line. The skeleton is extracted by shrinking the raster line from its boundary in all directions until a one-pixel-width, eight-connected line remains. In the medial axis transformation method, the discrete medial axis pixels are the local maxima of a transformation value.
For centerline extraction, we selected a scan-line-oriented method, which is simple and executes faster than most other algorithms. In this one-pass process, the computation is linear in the number of pixels on a raster line. Because the centerline direction is known, the optimal scan line direction, which is perpendicular to the centerline, can easily be set for most situations. During scanning, the pixels along a scan line, which is a small segment of an image column, are processed from top to bottom. A robust recursive filtering technique eliminates noise such as gaps (although most of the gaps and gray-scale irregularities have already been removed during the color space transformation) and also provides segmentation for multiple centerlines such as double solid lines. Figure 4 depicts the results of this processing step.
Once the boundary points are extracted, a line-following routine generates the boundary lines, which are subject to further cleaning, such as removing irregularities by applying geometrical constraints. In the final step, the midpoints are computed and the centerline is extracted, as shown in Figure 5.
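The scan-line idea, from boundary points to midpoints, can be sketched as follows. This is an illustration under stated assumptions, not the implementation described here: a simple gap-bridging rule stands in for the recursive filter, the scan lines are taken to be whole image columns, and the names `scan_column`, `centerline_points`, `max_gap` and `min_width` are hypothetical.

```python
def scan_column(column, max_gap=2, min_width=3):
    """Return (top, bottom) index pairs of foreground runs in one scan line.

    Gaps of at most max_gap background pixels inside a run are bridged, and
    runs narrower than min_width pixels are discarded as noise.
    """
    runs = []
    start = None  # first pixel of the run being built
    last = None   # last foreground pixel seen in that run
    for i, v in enumerate(column):
        if not v:
            continue
        if start is None:
            start = last = i
        elif i - last - 1 <= max_gap:
            last = i                    # same run, possibly across a small gap
        else:
            runs.append((start, last))  # gap too large: close the run
            start = last = i
    if start is not None:
        runs.append((start, last))
    return [(a, b) for a, b in runs if b - a + 1 >= min_width]


def centerline_points(binary, max_gap=2, min_width=3):
    """Scan every image column and return (col, midpoint_row) per detected line."""
    points = []
    n_rows = len(binary)
    n_cols = len(binary[0]) if binary else 0
    for c in range(n_cols):
        column = [binary[r][c] for r in range(n_rows)]
        for top, bottom in scan_column(column, max_gap, min_width):
            points.append((c, (top + bottom) / 2.0))
    return points
```

Each pixel is visited once, so the cost grows linearly with the number of pixels scanned, and a column that crosses a double solid line yields two separate runs and therefore two midpoints.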

Figure 4. Centerline boundary points extracted.

Figure 5. Automatically extracted centerlines.
To achieve the highest accuracy possible, the 3-dimensional centerline positions must be obtained from stereo imagery. Knowing the camera orientation, both interior and exterior, and the matching (identical) entities between the 2-dimensional centerlines, the 3-dimensional centerline position can be easily computed. Since the exterior orientation is provided by the navigation data, and the interior orientation can be determined by a priori calibration, the primary task is reduced to finding conjugate points or features in overlapping images. Since centerlines are shift-invariant, they cannot be used directly for matching purposes. There are a number of methods used for image matching, including feature-based and area-based techniques. Because of the special condition of the object space (near-parallel planes), a simple correlation-based area method seems to be adequate for this purpose. Feature points are used as the image primitives for matching.
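Once a conjugate point pair and both camera orientations are known, the 3-dimensional position follows from intersecting the two image rays. The sketch below is one simple way to do this (the midpoint of the shortest segment between the rays) and is not necessarily the computation used in this system. It assumes the camera centers and ray directions have already been derived from the interior and exterior orientation; the function name is hypothetical.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Return the midpoint of the shortest segment between two 3D rays.

    c1, c2 are camera centers and d1, d2 are ray directions, all given as
    (x, y, z) tuples in the same mapping frame.
    """
    def dot(p, q):
        return sum(a * b for a, b in zip(p, q))

    w = tuple(a - b for a, b in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; no stable intersection")
    t1 = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t2 = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    p1 = tuple(ci + t1 * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t2 * di for ci, di in zip(c2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))


# Two exposures 1 unit apart, both seeing a road point 2 units below.
point = triangulate_midpoint((0.0, 0.0, 0.0), (0.5, 0.0, -2.0),
                             (1.0, 0.0, 0.0), (-0.5, 0.0, -2.0))
```

With noisy orientation data the two rays do not meet exactly, which is why the midpoint of their closest approach is returned rather than a strict intersection.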

Feature points correspond to points of high curvature or high gray-level variation. For simplicity, a corner detector based on the following operator was selected to extract the feature points:
In the operator, the smoothing operation is applied to the gray-level image I(x, y), and Ix and Iy denote its derivatives in the x and y directions, respectively. Figure 6 depicts feature points extracted around the centerline region from overlapping images.
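As an illustration of a corner operator built from smoothed products of the directional derivatives, the sketch below uses a Harris-style response, det(C) - k·trace(C)^2. This is an assumption made for the example and may differ from the operator actually selected here; the 3 × 3 box smoothing, the central-difference derivatives and the constant k = 0.04 are likewise illustrative choices.

```python
def harris_response(img, k=0.04):
    """Return a Harris-style corner response for a 2D list of gray values."""
    h, w = len(img), len(img[0])
    ix2 = [[0.0] * w for _ in range(h)]
    iy2 = [[0.0] * w for _ in range(h)]
    ixy = [[0.0] * w for _ in range(h)]
    # Central-difference derivatives; border pixels keep zero gradient.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            ix2[y][x], iy2[y][x], ixy[y][x] = gx * gx, gy * gy, gx * gy

    def box(a, y, x):
        """Smooth with a 3 x 3 box filter (stand-in for the smoothing step)."""
        return sum(a[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0

    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            sxx, syy, sxy = box(ix2, y, x), box(iy2, y, x), box(ixy, y, x)
            resp[y][x] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return resp
```

The response is positive where the gray level varies in both directions (a corner), negative along a straight edge, and zero in flat regions, so feature points can be taken as local maxima above a threshold.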

Figure 6. Feature points extracted.
The matching of the feature points is accomplished through correlation. The search space is constrained by the epipolar geometry. For a given feature point s, a correlation window of size n × m is centered at its location in the first image, at point s1. Then a search window is selected around the approximate location of the same object point s in the second image, point s2, and the correlation operation is performed along the epipolar line. The search window size and location are determined from the navigation data. The correlation score is defined as
where Īk(uk, vk) is the average of Ik (k = 1, 2) at point (uk, vk), and λ is a normalizing factor chosen so that the score ranges from -1, for two correlation windows that are not similar at all, to 1, for two correlation windows that are identical. A point in the first image may be paired with several points in the second image. Several techniques exist for resolving such matching ambiguities. Because the object surface is nearly planar, a 6-parameter affine transformation provides an adequate geometrical relation between two images. Therefore, by estimating the affine transformation parameters from the conjugate points, straightforward blunder detection can be used effectively to disambiguate matches and remove outliers.
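A correlation score with the stated range of -1 to 1 can be sketched as a standard normalized cross-correlation. The exact normalization used here is not reproduced, so the formula below is an assumption; for brevity the epipolar line is taken to be a single image row, and the window is given by half-sizes n and m. All function names are hypothetical.

```python
import math


def window(img, u, v, n, m):
    """Return the (2n+1) x (2m+1) window centered at column u, row v, as a flat list."""
    return [img[v + j][u + i] for j in range(-m, m + 1) for i in range(-n, n + 1)]


def correlation_score(img1, p1, img2, p2, n=1, m=1):
    """Normalized cross-correlation of two windows; p1 and p2 are (u, v) points."""
    w1 = window(img1, p1[0], p1[1], n, m)
    w2 = window(img2, p2[0], p2[1], n, m)
    mean1 = sum(w1) / len(w1)
    mean2 = sum(w2) / len(w2)
    num = sum((a - mean1) * (b - mean2) for a, b in zip(w1, w2))
    den = math.sqrt(sum((a - mean1) ** 2 for a in w1) *
                    sum((b - mean2) ** 2 for b in w2))
    return num / den if den > 0 else 0.0  # a flat window carries no information


def best_match_along_row(img1, p1, img2, row, u_min, u_max, n=1, m=1):
    """Search candidate columns on one row of img2; return (score, u) of the best."""
    scores = [(correlation_score(img1, p1, img2, (u, row), n, m), u)
              for u in range(u_min, u_max + 1)]
    return max(scores)
```

Because the search runs over one coordinate only, its cost grows with the length of the search interval rather than with its area, which is the benefit of the epipolar constraint described above.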
After determining the transformation parameters between consecutive images, the centerline segments are connected and an approximate centerline can be formed incrementally. However, the final coordinates of the centerline can be computed only in post-processing mode, once the final navigation data become available. To illustrate the fit between images, an image strip was built by transforming five consecutive images into the same frame, as shown in Figure 7.
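The estimation of the 6-parameter affine transformation from conjugate points, combined with a simple blunder rejection, can be sketched as follows. This is a minimal illustration, assuming an ordinary least-squares fit and a rule that repeatedly drops the match with the largest residual; the actual blunder detection used may differ, and the function names and the tolerance `tol` are hypothetical. The points must not all be collinear, otherwise the normal equations are singular.

```python
def solve3(a, b):
    """Solve the 3x3 system a*x = b by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, 3):
            f = m[r][c] / m[c][c]
            for j in range(c, 4):
                m[r][j] -= f * m[c][j]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][j] * x[j] for j in range(r + 1, 3))) / m[r][r]
    return x


def fit_affine(pairs):
    """Least-squares fit of x' = a0 + a1*x + a2*y and y' = b0 + b1*x + b2*y.

    pairs is a list of ((x, y), (x', y')) conjugate points.
    """
    rows = [(1.0, x, y) for (x, y), _ in pairs]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs_x = [sum(r[i] * q[0] for r, (_, q) in zip(rows, pairs)) for i in range(3)]
    rhs_y = [sum(r[i] * q[1] for r, (_, q) in zip(rows, pairs)) for i in range(3)]
    return solve3(normal, rhs_x), solve3(normal, rhs_y)


def residual(params, pair):
    """Distance between the transformed first point and the observed second point."""
    a, b = params
    (x, y), (xp, yp) = pair
    ex = a[0] + a[1] * x + a[2] * y - xp
    ey = b[0] + b[1] * x + b[2] * y - yp
    return (ex * ex + ey * ey) ** 0.5


def robust_affine(pairs, tol=1.0):
    """Refit repeatedly, dropping the worst match while its residual exceeds tol."""
    kept = list(pairs)
    while len(kept) > 3:
        params = fit_affine(kept)
        worst = max(kept, key=lambda p: residual(params, p))
        if residual(params, worst) <= tol:
            return params, kept
        kept.remove(worst)
    return fit_affine(kept), kept
```

The surviving matches and the fitted parameters can then be used to bring consecutive images into a common frame.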

Figure 7. Automatically formed image strip.