The great overhaul

Keep it simple, stupid. Let’s see if we can do that: core is simple, add a bit of sugar on top.

Functions

So we have functions. Let’s see how that works.

function Foo,Bar do_it(Meh arg)
end

This makes a function in the module space. To call it:

do_it(arg)
arg.do_it()

You can also put a function inside a class definition. This only adds namespacing: from outside the class, the function must be prefixed with the class name.

object Foo
    function bar() end

    function baz()
        bar()
    end

    method blah()
        bar()
        baz()
    end
end

-- elsewhere
Foo:bar()
Foo:baz()

Note that inside a method’s body, we have an implicit self so there might be conflicts there:

object Meh
    method foo()
        bar(5)      -- could be functions bar(uint) or bar(Meh,uint)
    end
end

In D, the method of the object shadows the outside function. Makes sense. If you really want the function, you need to use its module’s path.

Methods

Method definitions are a simple syntactic shortcut:

object Foo
    method blah()
    end
end

Is exactly the same as:

object Foo
    function blah(Foo self)     -- for value classes, `self` is passed by ref
    end
end

Meaning that the method has the hidden self, and is defined in the class’s namespace. When using extend, same story.

Note that I mean “exactly the same” in terms of function prototype. In the body of a method, the self keyword can be omitted when calling methods on self.
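Python happens to work this way too, so it makes a handy sketch of the equivalence. The names Foo, blah and name below are only illustrative, and this is a model of the idea, not of how sool would implement it: a plain function taking the instance as its first parameter, once placed in the class namespace, behaves exactly like a method.

```python
class Foo:
    def __init__(self, name):
        self.name = name


def blah(self):
    # A plain function; `self` is just its first parameter.
    return "blah on " + self.name


# Putting the function in the class namespace is all it takes to make a method.
Foo.blah = blah

foo = Foo("x")
print(foo.blah())      # method-call syntax
print(Foo.blah(foo))   # namespaced function call, self passed explicitly
```

Both calls reach the same function with the same arguments; only the spelling at the call site differs.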

Constructors and factories

They are just methods and functions with a special name, init, that is called when you use the class name as a function (sugar!):

object Foo
    function Foo? init(bool isRaining)
        if isRaining then
            return Foo()
        else
            return nil
        end
    end

    method init()
        -- do the thing
    end
end

-- later

local a = Foo(true)     -- is really Foo:init(true)
local b = Foo()         -- really: make a Foo, then call method init() on it

Factories and constructors cannot have the same selector, obviously, and a constructor cannot return anything. Constructors also get a special semantic analysis that prevents you from reading uninitialized members or passing self as a parameter (which includes sending a message to self).
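A rough Python analogue of the split, as a sketch only. Python cannot overload the class call by selector, so the factory gets its own hypothetical name, make, instead of sharing init; the attribute ready is also just for illustration.

```python
from typing import Optional


class Foo:
    def __init__(self) -> None:
        # Constructor: runs on an already allocated instance and returns nothing.
        self.ready = True

    @classmethod
    def make(cls, is_raining: bool) -> Optional["Foo"]:
        # Factory: an ordinary function in the class namespace, free to return nil.
        return cls() if is_raining else None


a = Foo.make(True)    # factory path, may yield None
b = Foo()             # constructor path, always yields a Foo
```

The factory is the only one of the two that can decline to produce an object, which is why it needs a return type and the constructor does not.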

Interfaces

Interfaces cannot declare “functions”, only methods, because the point of an interface is to test whether classes comply with it.

interface UintMaker
    uint make(ArgType arg)
end

If there is a function uint make(Foo,ArgType), whatever way it is defined (in the class, outside the class, in extension), then class Foo complies with interface UintMaker. Period. Nothing changed.
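The compliance rule can be sketched as a lookup in a flat table of functions. Everything here is a hypothetical model (the FUNCTIONS table, declare, complies, and types represented as plain strings); the point is only that where a function was declared plays no part in the check.

```python
# Every function is recorded by name, return type and parameter types,
# regardless of where it was declared.
FUNCTIONS = set()


def declare(name, returns, *params):
    FUNCTIONS.add((name, returns, params))


def complies(cls, interface):
    """True if each interface method exists as a function taking cls first."""
    return all(
        (name, returns, (cls, *params)) in FUNCTIONS
        for name, returns, params in interface
    )


# interface UintMaker: uint make(ArgType arg)
UINT_MAKER = [("make", "uint", ("ArgType",))]

declare("make", "uint", "Foo", "ArgType")   # class body, module or extension
declare("make", "text", "Bar", "ArgType")   # same name, wrong return type
```

With these declarations, Foo passes the check and Bar does not, since its make returns the wrong type.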

References

Internally, references are a type, and in low level code they are handled differently than the type they refer to (by dereferencing). However, like in C++, references cannot be re-initialized: once set, they are a perfect alias of the original variable and any operation happening to them happens to the original one.

Question is: do we need a full-blown reference type, usable for members and as template arguments, or is the only reasonable use passing values to functions and returning them?

If it is a full type, then we need to be able to declare it for locals:

function foobar()
    local myRef = ref something.that[might].be["quite"][far]

    myRef.do_this
    myRef.do_that
    myRef.dance_a_little_jig
    myRef = replacement     -- assigns through to the original; myRef is never rebound
end

Note that “taking the reference” operation is automatic for the first “assignment” to a thing that is declared as reference:

function fiveIt(ref uint i)
    i = 5
end

-- later

local foo = 10
fiveIt(foo)     -- no need to write fiveIt(ref foo)
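Python has no pass-by-reference, so the semantics have to be modelled explicitly. In this sketch a small hypothetical Alias class stands in for ref, and a dict stands in for the caller's storage; sool would do the wrapping implicitly at the call site.

```python
class Alias:
    """Stands in for `ref`: reads and writes go through to the original storage."""

    def __init__(self, container, key):
        self._container = container
        self._key = key

    def get(self):
        return self._container[self._key]

    def set(self, value):
        self._container[self._key] = value


def five_it(i):
    # Assigning to the alias changes the caller's variable.
    i.set(5)


frame = {"foo": 10}            # models the caller's local `foo`
five_it(Alias(frame, "foo"))   # the explicit wrapping is what sool would hide
```

After the call, frame["foo"] holds 5: the callee never saw a copy, only a handle to the original slot.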

Not sure about members though:

object
    ref uint i

    method init(ref uint other)
        i = other       -- this is the first assignment, but how to know?
        i = ref other   -- more straightforward
    end
end

Of course, this ref operator cannot be applied to things that are not on the heap, like locals, because that would allow horrible, horrible things like returning a pointer to a local, whose scope is limited.

function ref uint blah()
    local i = 5
    local myRef = ref i

    return myRef        -- EEEEEEK!
end

Anyway, a local is a simple name, you never need a shortcut to it from the same function body. I think. Then again, you can definitely pass a local by reference, since it always outlives the inner function. Maybe the forbidden thing is only to return a reference to a local.

Now how about having references as template arguments:

local arr = array|(ref uint)
arr.push(somevar)

arr[1] = otherint       -- are we setting the array cell, or the thing referred to by it?
arr[1] = ref otherint   -- is it clearer? or even possible?

Even the prototypes would be weird:

method ref T [](uint)           -- template
method ref ref uint [](uint)    -- applied template to uint

Ooops, can reference types be recursive? How can we possibly use the intermediary steps, since they are always autodereferenced?

This seems like a bad idea. The natural use for references is arguments and return values. Both are locals. We can add explicitly declared locals, why not, for nice shortcuts. But no other variables.

But they do participate in the functions’ prototypes, and therefore in interface compliance.

interface Blah
    changeAnInt(ref int)
    ref text getSomeVar()
end

Cool stuff could happen for loops and iterators:

for i, ref v in arr do
    v = 5       -- no more of the annoying Lua way of reindexing arr[i]
                -- which also means no boundary check. FAAAST.
end

Final note: the name “reference” is well known, especially from C++. But it is a tad confusing that our object variables are also said to be references.

function(ref MyObject obj)
    -- obj is a reference to a reference to a MyObject
end

Another name could be alias. It abstracts the fact that we’re using pointers, and says “this is now exactly the same as that other variable”. Longer to type though, and “passing/returning an alias” is not exactly as well known as “passing/returning by reference”.

function fiveIt(alias uint i)
    i = 5
end

object Foo
    text blah
    method alias text getText()
        return blah
    end
end

Maybe.

Promoting values to generic

Generics use dynamic dispatch, and I always assumed that, like in C++, you could only apply this to pointers, and therefore, to objects. But references/aliases are in fact pointers to data, and it makes them look like objects. With a pointer to a value, and a bunch of method pointers, you could definitely cast a value to a generic.

The only teeny-weeny problem is that the pointer has to be valid (obviously), so it would work only for values that are stored on the heap (array elements and object members). However, we know that there is a way to alias a local, by passing it “by reference”:

function a()
    local i = 50
    b(i)
end

function b(alias uint i)
    local gen = i to MyGeneric
    someobject.save(gen)
end

Oooh, bad. Here, someobject potentially remembers the address of the local i in function a, which is destroyed when a returns. Very bad. Two solutions: no casting values to generics, or somehow refuse to cast one if you don’t know where it comes from. The same way the compiler doesn’t let you return a local as an alias, it won’t let you cast a local to a generic, only one that comes from elsewhere, because that’s guaranteed to be from the heap.
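One possible reading of that rule, written as a tiny checker. The Origin categories and both function names are assumptions made for this sketch: it takes the strict view that an alias parameter counts as “don’t know where it comes from”, so only storage known to be on the heap can be cast.

```python
from enum import Enum


class Origin(Enum):
    LOCAL = "local"         # declared in the current function body
    ALIAS_PARAM = "alias"   # received by reference; the caller's storage is unknown
    HEAP = "heap"           # array element or object member


def may_cast_to_generic(origin):
    # A generic may be saved anywhere, so the pointer must never dangle.
    return origin is Origin.HEAP


def may_return_as_alias(origin):
    # A local dies with its frame; anything received from the caller outlives us.
    return origin is not Origin.LOCAL
```

Under this reading, the cast inside b above is rejected, because i arrived as an alias parameter and might point into a’s frame.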

Anonymous functions

We now have functions. So anonymous ones seem easy. But so far we had used anonymous methods as a shortcut to:

  • add a nameless method to a class
  • wrap an object of this class in a generic which demands that prototype
  • pass/save this generic (object pointer plus function pointer)

With plenty of syntactic shortcuts. That doesn’t work anymore for functions, which do not have an object. Can’t have a generic that acts on nothing.

We would need a new concept: the first-class function. Or function pointer. Or “function interfaces”, which are just a way to invent a function pointer type.

The whole anonymous method and callback discussion was pretty clunky already. Maybe this needs a rewrite as well.

Note: anonymous methods could really be anonymous functions that close on self. Instead of having it as a hidden parameter, it could be a hidden upvalue.

In any case, for anything like this to happen in a clean and simple way, we would need function (pointers) as a primitive type. And a recursive one at that.

uint function(Foo,function(Bar))

Here’s a function that returns a uint, takes a Foo and a function that takes Bar. Also, syntax fun:

(array|Foo) function(Bar)
array|(Foo function(Bar))
int function(int) function(bool)    -- woo, a function that returns a function!

Grr. Also, the tiny problem that so far, everything was an instance of a class, with methods, etc. Even arrays, which have a type parameter and are interpreted at the low level as templates. But functions would need variadic templates, because of their varying number of parameters.
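For comparison, here is how the same four function types look with Python’s typing.Callable, which takes a parameter list of any length followed by the return type. Foo and Bar are empty placeholder classes, uint is approximated by int, and pick is a hypothetical function added just to exercise the last type.

```python
from typing import Callable, List


class Foo: ...
class Bar: ...


# uint function(Foo, function(Bar))
TakesCallback = Callable[[Foo, Callable[[Bar], None]], int]
# (array|Foo) function(Bar)
ReturnsArray = Callable[[Bar], List[Foo]]
# array|(Foo function(Bar))
ArrayOfFunctions = List[Callable[[Bar], Foo]]
# int function(int) function(bool)
ReturnsFunction = Callable[[bool], Callable[[int], int]]


def pick(double: bool) -> Callable[[int], int]:
    # A function that returns a function, matching ReturnsFunction.
    return (lambda n: n * 2) if double else (lambda n: n + 1)
```

The bracketed parameter list removes the ambiguity between “function returning an array” and “array of functions”, at the cost of a noisier spelling.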

THE HORROR. Is a function an object? Can I call methods on it? The only thing we really want from a function is to call it. So maybe a function is an object with a single method call and a bunch of arguments.

Then would function call be a shortcut for calling the method call on a function object, which is itself calling a function with the object as hidden self and… AAAAH. Madness.

Let’s wave this away for now. Maybe sool is not ready for this.

Templates

So far, templates applied only to whole classes, therefore affecting the member types, the method prototypes, and the methods’ code. Since functions are now quite independent from classes, could they be templates as well?

function foo|(any T) ()

end

With of course an “automatic” template if they are defined in template classes or extensions.

object Foo|(any T)
    method blah()
    end
end

-- is actually

function Foo:blah|(any T) (Foo|T self)
end

The semantic analyzer will have so much fun with these…

local f = Foo|uint
f.blah

Results in:

  • look for all functions blah(Foo|uint)
  • didn’t find any
  • look for templates that might match
  • found blah(Foo|T) with T is uint
  • is uint compliant with interface any?
  • yes
  • instantiate function template blah|(any T) (Foo|T self) for uint

Oh lord.
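The lookup steps above can be sketched in a few dozen lines. This is a toy model under loud assumptions: types are strings, Foo|uint is the tuple ("Foo", "uint"), every type complies with any, and “instantiating” just records the bindings. The names unify, complies and resolve are all made up for the sketch.

```python
def unify(pattern, actual, params, bindings):
    """Match a parameter pattern against actual types, binding template parameters."""
    if isinstance(pattern, str):
        if pattern in params:                        # template parameter such as T
            return bindings.setdefault(pattern, actual) == actual
        return pattern == actual                     # plain type name
    return (
        isinstance(actual, tuple)
        and len(pattern) == len(actual)
        and all(unify(p, a, params, bindings) for p, a in zip(pattern, actual))
    )


def complies(type_name, interface):
    # Assumption for the sketch: every type complies with `any`, nothing else is known.
    return interface == "any"


def resolve(name, arg_types, concrete, templates):
    if (name, arg_types) in concrete:                # look for an exact function
        return concrete[(name, arg_types)]
    for params, signature in templates.get(name, []):    # templates that might match
        bindings = {}
        if unify(signature, arg_types, params, bindings) and all(
            p in bindings and complies(bindings[p], iface)
            for p, iface in params.items()           # check interface constraints
        ):
            concrete[(name, arg_types)] = (name, dict(bindings))   # instantiate
            return concrete[(name, arg_types)]
    return None


concrete = {}
# function blah|(any T) (Foo|T self)
templates = {"blah": [({"T": "any"}, (("Foo", "T"),))]}
result = resolve("blah", (("Foo", "uint"),), concrete, templates)
```

Once instantiated, the function sits in the concrete table, so the next f.blah on a Foo|uint is found by the first step without touching the templates again.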

For the fun: can you have templates of templates? I guess so.

object Foo|(any T)
    method blah|(any U) ()
    end
end

-- is actually

function Foo:blah|(any T, any U) (Foo|T self)
end

Oh double lord. On the other hand, separating classes from methods might actually make things easier. When you have a class template, just generate the accessors as templated functions. Then all you have is functions, and the lookup can be the same in every case. Maybe.

In terms of the compiler: reading the DMD compiler’s source, I noticed a pretty nice way to deal with templates. Template declarations are basically just scope declarations, which bind a symbol to an (as yet unknown) type. When analyzing the code within, you do type lookup just like for any other class, except that the symbol resolves, in this scope, to whatever the template parameter is bound to. This works transparently for classes, functions, extensions and whatnot.

In the instantiation part, when you use template parameters, you just make a template instance type, which refers to the original symbol (which in our case is always either a type or a function) and keeps a list of argument types to bind to the unresolved symbols.
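A loose model of that description, not of DMD’s actual classes: Scope, TemplateDeclaration and TemplateInstance are names invented for this sketch, and types are plain strings.

```python
class Scope:
    def __init__(self, parent=None):
        self.parent = parent
        self.symbols = {}

    def lookup(self, name):
        # Ordinary lookup: innermost scope first, then outwards.
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        raise KeyError(name)


class TemplateDeclaration:
    def __init__(self, name, params, enclosing):
        self.name = name            # the original symbol: a type or a function
        self.params = params        # unresolved symbols, e.g. ["T"]
        self.enclosing = enclosing  # scope the template was declared in


class TemplateInstance:
    def __init__(self, declaration, args):
        self.declaration = declaration   # refers back to the original symbol
        self.args = args                 # argument types to bind

    def scope(self):
        # A fresh scope in which each template parameter resolves to its argument.
        inner = Scope(self.declaration.enclosing)
        inner.symbols.update(zip(self.declaration.params, self.args))
        return inner


module = Scope()
module.symbols["uint"] = "uint"
foo = TemplateDeclaration("Foo", ["T"], module)
foo_uint = TemplateInstance(foo, ["uint"])   # Foo|uint
```

Code analyzed inside foo_uint.scope() looks up T like any other name and gets uint, while names such as uint itself still resolve through the enclosing module scope.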

It probably sounds easier on paper. But one has to start somewhere.
