TRTC – Ray-Sphere Intersections – Chapter 5

Hitting my head on the wall…

Whoa. It’s been a while. First I got lost on a side track implementing scriptability for my other macOS project. During that process I learned that sometimes you should just think for a while, trust yourself and skip the search engine 🙂

Still, the main reason this one took so long was that at the end of Chapter 5, when there were only two test cases to go, I hit the wall. The tests just didn’t pass and I felt really frustrated. So I asked Jamis for help, and thanks to him I was able to figure out the first problem.

The problem was that whenever my implementation returned points or vectors as a result, I was counting on the approximation in the calculations being good enough, and that caused the w components to slowly get corrupted. At the end of the calculations w was no longer exactly 0.0 or 1.0. That means the tuples were no longer points or vectors. They were something in the middle. And that of course caused other problems.
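To make the w problem concrete, here is a minimal sketch of classifying a tuple by its w component with a tolerance and snapping it back to an exact value. The `Tuple4` type, the `snappedW` helper and the epsilon value are all made up for this illustration; they are not the RGTuple code from my project.

```swift
// Hypothetical stand-in for a 4-component tuple; only w matters here.
struct Tuple4 {
    var x, y, z, w: Double
}

/// Snaps w to exactly 1.0 (point) or 0.0 (vector) when it is within
/// `epsilon` of one of them. Returns nil when w has drifted too far
/// to tell what the tuple was supposed to be.
func snappedW(_ t: Tuple4, epsilon: Double = 1e-5) -> Tuple4? {
    var result = t
    if abs(t.w - 1.0) < epsilon {
        result.w = 1.0   // close enough to a point
        return result
    }
    if abs(t.w) < epsilon {
        result.w = 0.0   // close enough to a vector
        return result
    }
    return nil           // neither a point nor a vector anymore
}
```

A tuple with w = 0.9999999 comes back as a clean point, while one with w = 0.5 is rejected instead of silently being treated as something it is not.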

The bigger problem was still the fact that I was using Swift’s Float (32-bit) as my primary data type. Basically there are two main floating point data types in most programming languages. There is float, which is a single precision (32-bit) floating point data type, and then there is double, which is the double precision (64-bit) floating point data type. In Swift the names are Float and Double. So why was I using Floats in the first place? Well, that’s a good question, and I found there were maybe three reasons.
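As a rough illustration of the difference in precision between the two types (this is just a small standalone snippet, not code from the ray tracer), you can compare the machine epsilon of both and look at how the same decimal literal is stored:

```swift
// Machine epsilon: the gap between 1.0 and the next representable value.
print(Float.ulpOfOne)    // roughly 1.19e-07
print(Double.ulpOfOne)   // roughly 2.22e-16

// The same decimal literal stored at both precisions.
let asFloat: Float = 0.1
let asDouble: Double = 0.1

// Widening the Float to Double exposes the rounding error it already carries.
let difference = Double(asFloat) - asDouble
print(difference)        // small but not zero
```

So a Float gives about 7 significant decimal digits and a Double about 15 to 16. Every multiplication and addition in a chain of matrix operations can lose a little more of that.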

The first one was that the book suggests using data structures that are simple. So somehow I was thinking that Floats are simpler than Doubles (it doesn’t make much sense, I know, but that was the immediate thought in my mind).

The other reason was that, before knowing anything about how the book was going to guide me, I was thinking that maybe I would need to use Metal in my implementation at some point, and all the examples I have seen so far that deal with Metal use Floats. So that mindset was there already. That was something I didn’t think through before moving on.

The third reason, which cemented the usage along the way, was that until the end of this chapter I got every test case to pass without any problems. So I had no good reason to change anything. There might be a way to make this whole thing work with Floats, I don’t know. I was advised on the forum http://forum.raytracerchallenge.com to use doubles instead of floats. The advisor (“ascotti”) mentioned that even if it might be possible to get the ray tracer to work with floats, “The pain is not worth the gain.”

So thanks to Jamis and “ascotti” I decided to change everything from Float to Double and that pretty much solved all the problems.

So, back on track

In this chapter we dive into ray casting. It means creating a ray, or a line, and finding the intersections of that ray with the objects in a scene.

From here on out, each chapter will culminate in something concrete, something visual, which will add to your growing store of eye candy.
Jamis Buck

Creating Rays

A ray has an origin and a vector that represents its direction. So the first thing is to create the ray data structure and test its inner values. The test case is simple, like the actual implementation.

/**
 RGRay struct.
 Origin is the ray's starting point and
 direction is where the ray is pointing to.
 */
public struct RGRay {
    public var origin: RGTuple
    public var direction: RGTuple
}

func testRGRayCreation() {
    // Given
    let origin = newRGPoint(x: 1, y: 2, z: 3)
    let direction = newRGVector(x: 4, y: 5, z: 6)
    
    // When
    let ray = newRGRay(origin: origin, direction: direction)
    
    // Then
    XCTAssertEqual(ray.origin.isEqualTo(origin), true)
    XCTAssertEqual(ray.direction.isEqualTo(direction), true)
    
}

So there is a starting point, the origin, and then there is the direction where the ray is pointing, the vector. Then we use t to describe a distance from the origin to a point along the ray. Based on the pseudocode I created a function that takes a ray and t as arguments and returns the new point. The test is here, followed by the convenience function used to create a ray:

func testRGRayPosition() {
    // Given
    let o = newRGPoint(x: 2, y: 3, z: 4)
    let d = newRGVector(x: 1, y: 0, z: 0)
    let r = newRGRay(origin: o, direction: d)
    
    // Then
    let p1 = newRGPosition(ray: r, t: 0)
    let p2 = newRGPosition(ray: r, t: 1)
    let p3 = newRGPosition(ray: r, t: -1)
    let p4 = newRGPosition(ray: r, t: 2.5)
    
    XCTAssertEqual(p1.isEqualTo(newRGPoint(x: 2, y: 3, z: 4)), true)
    XCTAssertEqual(p2.isEqualTo(newRGPoint(x: 3, y: 3, z: 4)), true)
    XCTAssertEqual(p3.isEqualTo(newRGPoint(x: 1, y: 3, z: 4)), true)
    XCTAssertEqual(p4.isEqualTo(newRGPoint(x: 4.5, y: 3, z: 4)), true)
}

/**
 This is a convenience function to create a ray.
 */
public func newRGRay(origin: RGTuple, direction: RGTuple) -> RGRay {
    return RGRay(origin: origin, direction: direction)
}
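The position calculation itself is just origin + direction * t. Here is a minimal self-contained sketch of that idea. The `Vec3`, `Ray` and `position` names are made up for this example and use plain structs instead of the RGTuple and RGRay types of my project.

```swift
// Hypothetical minimal types, only for illustrating the formula.
struct Vec3 {
    var x, y, z: Double
}

struct Ray {
    var origin: Vec3
    var direction: Vec3
}

/// Returns the point that lies `t` units of `direction` away from the ray's origin.
/// A negative t gives a point behind the origin.
func position(_ ray: Ray, t: Double) -> Vec3 {
    return Vec3(x: ray.origin.x + ray.direction.x * t,
                y: ray.origin.y + ray.direction.y * t,
                z: ray.origin.z + ray.direction.z * t)
}

let ray = Ray(origin: Vec3(x: 2, y: 3, z: 4), direction: Vec3(x: 1, y: 0, z: 0))
let p = position(ray, t: 2.5)
print(p.x, p.y, p.z)   // prints "4.5 3.0 4.0"
```

These are the same numbers as in the last assertion of the position test above.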

Intersecting Rays with Spheres

At the starting point all spheres are at location (0,0,0) and are unit spheres with a radius of 1. This follows the book, so that things stay simple. When intersecting a ray with a sphere there are different cases that depend on the relation of the ray and the sphere. The whole sphere could be in front of the ray or behind it, or the ray could just touch the sphere at a tangent. The ray could also start inside the sphere, so that’s another case too. I won’t put all the examples here. Like previously, you can download the whole project from GitLab.

The intersect function needs to return a collection of t values. For example, if the sphere is in front of the ray, the first t is where the ray hits the sphere and the other t is where the ray comes out of the sphere after going through it. All the examples are in the book. As an example I put the second intersect test case here, which is “A ray intersects a sphere at a tangent”.

func testRGSphereIntersectAtTangent() {
    // Given
    let r = newRGRay(origin: newRGPoint(x: 0, y: 1, z: -5), direction: newRGVector(x: 0, y: 0, z: 1))
    // and
    let s = RGSphere()
    let xs = intersectRGSphere(s, ray: r)
    
    // Then
    XCTAssertEqual(xs.count == 2, true)
    XCTAssertEqual(xs[0].t.isEqualTo(Double(5)), true)
    XCTAssertEqual(xs[1].t.isEqualTo(Double(5)), true)
}

As a new thing here we need a sphere. The sphere will have an origin, a radius and a transformation matrix. It will also have a unique identifier. Because there could be many spheres in one scene, we need a way to identify each one. To implement sphere uniqueness I decided to use the NSUUID class, which among other things can generate unique id Strings. The RGSphere struct looks like this:

public struct RGSphere {
    public let id = NSUUID.init().uuidString
    public var origin = newRGPoint(x: 0, y: 0, z: 0)
    public var radius: Double = 1.0
    public var transform: RGMatrix4x4 = RGMatrix4x4.identity()
    
    public init() {
    }
}

So now we have a sphere and a ray. Now we need a way to find out where the ray intersects the sphere.

This intersect function is probably the most complicated one so far, even though there is nothing really complicated in it. It’s quite straightforward math. At this point it looks like this. (At this point it works just as well with Floats too.)

/**
 Calculates the intersection points of a ray with a sphere.
 */
public func intersectRGSphere(_ sphere: RGSphere, ray: RGRay) -> [RGIntersection] {
    var t: [RGIntersection] = []

    let p = newRGPoint(x: 0, y: 0, z: 0)
    
    let shpere_to_ray = ray.origin - p
    
    let a = simd_dot(ray.direction, ray.direction)
    
    let b = Double(2.0) * simd_dot(ray.direction, shpere_to_ray)

    let c = simd_dot(shpere_to_ray, shpere_to_ray) - Double(1.0)
    
    let discriminant = pow(Double(b), 2.0) - (Double(4.0) * a * c)

    if discriminant < Double(0.0) {
        return t
    }
    
    let t1 = (-b - sqrt(discriminant)) /  (Double(2.0) * a)
    let t2 = (-b + sqrt(discriminant)) /  (Double(2.0) * a)

    t.append(RGIntersection(t: t1, object: sphere))
    t.append(RGIntersection(t: t2, object: sphere))
    
    return t
}
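For reference, here is where a, b and c in the function above come from. This is the standard ray-sphere derivation written in my own notation (O is the ray origin, D the ray direction, C the sphere center and S = O - C, which is shpere_to_ray in the code), so take it as a sketch rather than a quote from the book.

```latex
\begin{aligned}
P(t) &= O + t\,D, \qquad \lVert P(t) - C \rVert^2 = 1 \\
(D \cdot D)\,t^2 + 2\,(D \cdot S)\,t + (S \cdot S - 1) &= 0, \qquad S = O - C \\
a = D \cdot D, \quad b &= 2\,(D \cdot S), \quad c = S \cdot S - 1 \\
t_{1,2} &= \frac{-b \mp \sqrt{b^2 - 4ac}}{2a}
\end{aligned}
```

For the tangent test case above, O = (0, 1, -5) and D = (0, 0, 1), so a = 1, b = 2 · (-5) = -10 and c = (0 + 1 + 25) - 1 = 25. The discriminant is 100 - 4 · 1 · 25 = 0, and both roots are t = 10 / 2 = 5, which is what the test expects. A negative discriminant means the ray misses the sphere.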

At this point we know the t values of the intersections. We still need a better way to keep track of what is intersected, that is, what kind of object there is at the point of intersection and how we should deal with it. And for that we need another data structure.

Tracking Intersections

So now we must start keeping track of what is intersected. I created an RGIntersection struct based on the guidance of the book. At this point it’s quite simple.

/**
 Stores information about an object that was intersected at the given location t.
 */
public struct RGIntersection {
    public var t: Double
    public var object: RGSphere
    
    public init(t: Double, object: RGSphere) {
        self.t = t
        self.object = object
    }
}

Of the test cases I show here the one that aggregates the intersections, mainly because I just want to demo the Swift mechanism for making a function that takes a variable number of arguments. So the test is here:

func testRGIntersectionsAggregation() {
    // Given
    let s = RGSphere()
    let i1 = RGIntersection(t: 1.d, object: s)
    let i2 = RGIntersection(t: 2.d, object: s)
    
    // When
    let xs = intersectinonsRG(i: i1, i2)
    
    // Then
    XCTAssertEqual(xs.count == 2, true)
    XCTAssertEqual(xs[0].t.isEqualTo(1.d), true)
    XCTAssertEqual(xs[1].t.isEqualTo(2.d), true)
}

And the implementation of the function that actually generates the array of intersections is here.

/**
 Generates a sorted array of RGIntersections.
 A variable number of intersections can be given to the function.
 The result is ordered from the lowest t to the highest, which hitRG relies on.
 Ex. let i = intersectinonsRG(i: inters1, inters2, inters3)
 */
public func intersectinonsRG(i: RGIntersection...) -> [RGIntersection] {
    // A variadic parameter arrives as an ordinary array, so it can be sorted directly.
    return i.sorted { $0.t < $1.t }
}

At this point in the book the intersect function has returned only t values. But I have already shown you the second version, which returns actual objects of type intersection (RGIntersection in my code). In that sense I have cheated a little, but I did it so that the code would look a little more consistent in this post.

Identifying Hits

Thinking about the ray, it kind of travels from its origin in the direction it has and goes through everything on its way. But in terms of rendering a scene we need to know which of those intersections is actually visible. And that is what this hitting things is all about. So the visible intersection is called a hit.

Here is the last hit test case, which has a little trick in it. As Jamis writes, it’s the intersections() function that must return the intersections in order, from the one with the lowest t to the one with the highest.

func testRGHits() {
    // Scenario: The hit is always the lowest nonnegative intersection
    // Given
    let s4 = RGSphere()
    let i1D = RGIntersection(t:  5.d, object: s4)
    let i2D = RGIntersection(t:  7.d, object: s4)
    let i3D = RGIntersection(t: -3.d, object: s4)
    let i4D = RGIntersection(t:  2.d, object: s4)
    let xsD = intersectinonsRG(i: i1D, i2D, i3D, i4D)
    
    // When
    let iD = hitRG(intersections: xsD)
    
    // Then
    XCTAssertEqual(iD!.object.id == i4D.object.id, true)
}

I haven’t used the Swift language feature optional that much yet, but here I think it might just be ok to use it. The hitRG function takes the array of RGIntersections as an argument, and if there is a hit it will return that particular intersection, and if there is none it will return nil. I am not 100% sure this is the best way to do this, but I guess it’s common in Swift. So the function is like this:

/**
 Hit will be the first intersection where t is nonnegative.
 If t is negative the intersection is behind the origin.
 Assumes the intersections are sorted from the lowest t to the highest.
 */
public func hitRG(intersections: [RGIntersection]) -> RGIntersection? {
    for i in intersections {
        if i.t >= 0 {
            return i
        }
    }
    return nil
}
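If you don’t want the hit lookup to depend on the order of the array, another way is to filter out the negative t values and take the minimum of the rest. This is a different technique from the loop above, shown as a small self-contained sketch. The `Crossing` struct and the `firstHit` function are names invented for this example, not part of my project.

```swift
// Hypothetical minimal intersection record: a t value and a label for the object.
struct Crossing {
    var t: Double
    var label: String
}

/// Returns the crossing with the lowest nonnegative t, or nil if every
/// crossing is behind the ray origin. Works on unsorted input.
func firstHit(_ crossings: [Crossing]) -> Crossing? {
    return crossings
        .filter { $0.t >= 0 }        // drop everything behind the origin
        .min { $0.t < $1.t }         // nearest of what is left
}

let xs = [
    Crossing(t: 5, label: "a"),
    Crossing(t: 7, label: "b"),
    Crossing(t: -3, label: "c"),
    Crossing(t: 2, label: "d"),
]
if let hit = firstHit(xs) {
    print(hit.label, hit.t)   // prints "d 2.0"
}
```

Giving each crossing its own label also makes it possible to check that the right intersection was picked, not just the right object.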

At this point we know how to find the ray’s intersection points on a sphere and how to store those hits. Next we will transform things.

Transforming Rays and Spheres

At this point all the spheres are at the origin. It’s reasonable to leave one there, but all the rest in the scene should be placed in some other location. Or at least their size should be different.

This change will affect the intersect function (intersectRGSphere in my case), because so far it has assumed that the sphere is at the origin and has a radius of 1.

It would be lovely if you could keep that assumption, while still allowing spheres to be resized and repositioned. It would make your implementation so much cleaner and simpler.
Jamis Buck

What will change is the distance from the ray’s origin to the object (sphere) and the relationship between the ray’s direction and the sphere’s position. There are these nice hand-drawn images in the book that demonstrate the different cases. You should definitely look at them.

So the actual “aha” idea here is not to move the objects (spheres) in the scene but instead to move the ray. It will have the same effect. The distance to the object it intersects will increase or decrease, and the direction from the ray origin to the intersected object will change, as the ray is put in another position. For every transformation you want to apply to an object in the scene, you take its inverse and apply that to the ray.
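Here is a small self-contained sketch of that idea for the simplest case, a sphere scaled uniformly by a factor s. Instead of scaling the sphere, the ray is scaled by 1/s and then intersected with the plain unit sphere. The `Vec3` type and both function names are invented for this example, and it handles only uniform scaling, not a general 4x4 matrix.

```swift
// Hypothetical minimal 3D vector, used for both points and directions.
struct Vec3 {
    var x, y, z: Double
}

func dot(_ a: Vec3, _ b: Vec3) -> Double {
    return a.x * b.x + a.y * b.y + a.z * b.z
}

/// Intersects a ray with the unit sphere centered at the origin.
/// Returns the t values in ascending order, or an empty array on a miss.
func intersectUnitSphere(origin o: Vec3, direction d: Vec3) -> [Double] {
    let a = dot(d, d)
    let b = 2 * dot(d, o)
    let c = dot(o, o) - 1
    let discriminant = b * b - 4 * a * c
    if discriminant < 0 { return [] }
    let root = discriminant.squareRoot()
    return [(-b - root) / (2 * a), (-b + root) / (2 * a)]
}

/// Intersects a ray with a sphere scaled uniformly by `s` by applying
/// the inverse scaling (1/s) to the ray instead of touching the sphere.
func intersectScaledSphere(scale s: Double, origin o: Vec3, direction d: Vec3) -> [Double] {
    let inv = 1 / s
    let o2 = Vec3(x: o.x * inv, y: o.y * inv, z: o.z * inv)
    // The direction is scaled too and deliberately NOT renormalized,
    // so the resulting t values are still valid for the original ray.
    let d2 = Vec3(x: d.x * inv, y: d.y * inv, z: d.z * inv)
    return intersectUnitSphere(origin: o2, direction: d2)
}

let ts = intersectScaledSphere(scale: 2,
                               origin: Vec3(x: 0, y: 0, z: -5),
                               direction: Vec3(x: 0, y: 0, z: 1))
print(ts)   // prints "[3.0, 7.0]"
```

A sphere of radius 2 at the origin is hit at z = -2 and z = 2, which is 3 and 7 units away from a ray starting at z = -5, so the numbers make sense.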

So the first thing needed is a function that transforms a ray. This function should return a new ray rather than modifying the original one. So here we test that the ray is transformable.

You need to keep the original, untransformed ray, so that you can use it to calculate locations in world space later
Jamis Buck

func testRGRayTransformTranslation() {
    // Given
    let r = newRGRay(origin: newRGPoint(x: 1, y: 2, z: 3), direction: newRGVector(x: 0, y: 1, z: 0))
    // and
    let m = newRGTranslationMatrix(x: 3, y: 4, z: 5)
    
    // When
    let r2 = transformRGRay(r, matrix: m)
    
    // Then
    XCTAssertEqual(r2.origin == newRGPoint(x: 4, y: 6, z: 8), true)
    XCTAssertEqual(r2.direction == newRGVector(x: 0, y: 1, z: 0), true)
}

/**
 - Parameter r: RGRay to be transformed.
 - Parameter matrix: The 4x4 transformation matrix.
 - Returns: New transformed RGRay.
 */
public func transformRGRay(_ r: RGRay, matrix m: RGMatrix4x4) -> RGRay {
    // Here we force the origin to be a point.
    var o = m * r.origin
    o.w = 1.0
    
    // Here we force the direction to be a vector.
    var d = m * r.direction
    d.w = 0.0
    
    return newRGRay(origin: o, direction: d)
}

The second thing to consider is that we need to assign the transformation to the sphere. The sphere’s default transformation will be the identity matrix, which is a logical starting point. To test the assignment I wrote these tests.

func testRGSphereDefaultTransfomr() {
    // Given
    let s = RGSphere()
    
    // Then
    XCTAssertEqual(s.transform == RGMatrix4x4.identity(), true)
}

func testRGSetSpheresTransform() {
    // Given
    var s = RGSphere()
    let t = newRGTranslationMatrix(x: 2, y: 3, z: 4)
    s.transform = t
    // Then
    XCTAssertEqual(s.transform == t, true)
}

And this was the breaking point for me. The final step was to add the functionality to transform the ray at the beginning of the intersect function (intersectRGSphere()) and then use the transformed ray inside the function to do the calculations. This was the point when Float values were no longer useful. Using Floats, the discriminant in the intersectRGSphere() function went below 0 and nothing was returned. What I did, as I described at the beginning of this post, was change the Floats to Doubles.

This was actually quite easy. In Xcode I just replaced every Float with Double. Xcode has a very nice mechanism for limiting the search/replace scope. I chose the current workspace and used that. There were a few occurrences of float in names that came from simd which I needed to change too. Other than that it was actually quite a fast thing to do. Maybe 5-10 min and all was working again.

So the test where everything stopped for a while was this one:

/**
 Intersecting a scaled sphere with a ray.
 */
func testIntersectionScaledRGSphereWithRay() {
    // Given
    let r = newRGRay(origin: newRGPoint(x: 0, y: 0, z: -5), direction: newRGVector(x: 0, y: 0, z: 1))
    // and
    var s = RGSphere()
    
    // When
    s.transform = newRGScalingMatrix(x: 2, y: 2, z: 2)
    // and
    let xs = intersectRGSphere(s, ray: r)
    
    // Then
    XCTAssertEqual(xs.count == 2, true)
    XCTAssertEqual(xs[0].t.isEqualTo(3.d), true)
    XCTAssertEqual(xs[1].t.isEqualTo(7.d), true)
}

After making the changes it passed and at this point the intersect function looks like this:

/**
 Calculates the intersection points of a ray with a sphere.
 - Returns: Array of RGIntersections.
 */
public func intersectRGSphere(_ sphere: RGSphere, ray: RGRay) -> [RGIntersection] {
    var t: [RGIntersection] = []
    
    let tRay = transformRGRay(ray, matrix: simd_inverse(sphere.transform))
    
    /*
     The vector from the sphere's center, to the ray origin
     # Remember: the sphere is centered at the world origin
     */
    let p = newRGPoint(x: 0, y: 0, z: 0)
    
    let shpere_to_ray = tRay.origin - p
    
    let a = simd_dot(tRay.direction, tRay.direction)
    
    let b = Double(2.0) * simd_dot(tRay.direction, shpere_to_ray)

    let c = simd_dot(shpere_to_ray, shpere_to_ray) - Double(1.0)
    
    let discriminant = pow(Double(b), 2.0) - (Double(4.0) * a * c)

    if discriminant < Double(0.0) {
        return t
    }
    
    let t1 = (-b - sqrt(discriminant)) /  (Double(2.0) * a)
    let t2 = (-b + sqrt(discriminant)) /  (Double(2.0) * a)

    t.append(RGIntersection(t: t1, object: sphere))
    t.append(RGIntersection(t: t2, object: sphere))
    
    return t
}

Maybe I will come back to this whole Float thing and study it more carefully. At the moment I am too impatient to jump to the next chapter.

Putting it together

Putting it together is implemented using the hints in the book. The code is in the playground for chapter 5. So here is the result. A red filled circle. I was just so happy to see it after all the frustration.

PS. In Finland we have this saying. “Well trouble is not like this, it is small, round and red”. In this context that was true for me anyway 😉