Welcome to the Madness

After a while of developing Clew, I realized it would probably be a good idea to start a development blog so that you can trace the collapse of my sanity in real time (why are emojis allowed in URLs anyway??? 😩).

what to expect

I have no idea. We’ll have to figure it out together. I’ll probably post progress updates here, so you can be sure that I’m not just spending my days eating chips and playing video games instead of reinventing the wheel like I’m supposed to. Perhaps the occasional announcement.

If nothing else, I plan to try and make it entertaining. So hey, follow along. There’s an Atom Feed to get updates as they come out, and if you don’t know what that means, read this excellent post by Hund (#DiscoveredWithClew) to learn all about it.

current status

So as not to leave you empty-handed, here’s a quick update on what’s going on right now.

I just finished coding a multithreaded, multi-queue task prioritization system for Ariadne, Clew’s crawler, and somehow (somehow) it worked first try. With that out of the way, I’ve set the crawler running on my server and the index is steadily growing.
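For readers wondering what a multithreaded, multi-queue prioritization scheme can look like in general, here is a minimal Python sketch. It is not Ariadne's code, which has not been published yet: the class `MultiQueueScheduler`, the helper `run_workers`, and the rule "always drain the highest-priority non-empty queue first" are all assumptions made purely for illustration.

```python
import collections
import threading


class MultiQueueScheduler:
    """Hypothetical scheduler: one FIFO queue per priority level.

    Level 0 is the most urgent. get() always serves the lowest-numbered
    non-empty queue, so urgent tasks jump ahead of everything else.
    """

    def __init__(self, levels):
        self._queues = [collections.deque() for _ in range(levels)]
        self._cond = threading.Condition()
        self._closed = False

    def put(self, priority, task):
        # Add a task to the queue for its priority level and wake one worker.
        with self._cond:
            self._queues[priority].append(task)
            self._cond.notify()

    def get(self):
        # Block until a task is available; return None once the scheduler
        # is closed and every queue has been drained.
        with self._cond:
            while True:
                for q in self._queues:
                    if q:
                        return q.popleft()
                if self._closed:
                    return None
                self._cond.wait()

    def close(self):
        # Signal that no more tasks are coming and wake all waiting workers.
        with self._cond:
            self._closed = True
            self._cond.notify_all()


def run_workers(scheduler, handler, n_threads):
    """Start n_threads workers that pull tasks until the scheduler closes."""

    def worker():
        while (task := scheduler.get()) is not None:
            handler(task)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    return threads
```

In this sketch, a crawler could put freshly discovered pages on a low-priority queue and re-crawls of known pages on a higher one, though how Ariadne actually assigns priorities is not described here.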

There’s still a little work to do before releasing the crawler’s code publicly on Clew’s Codeberg organization, but this was the biggest blocker to getting that done, so it shouldn’t be long now, assuming no mental breakdowns on my side.

Once that’s all wrapped up and the crawler is running at full speed, I’ll be starting work on a refactor of the query parsing and matching logic for searches.

You see, one of the current issues with Clew is that it can rank how strongly a page matches the given keywords, but it has no way to gauge whether a result matches all of them. For example, a search for “benjamin hollon” could put a page that just says “benjamin” over and over at the top of the results, even if it never says “hollon”, because the strength of the match for “benjamin” is so high.

Once this refactor is finished, results will first be sorted by the completeness of the match, then by how strongly they match the keywords.
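To make the two-level sort concrete, here is a small Python sketch. It is an illustration only, not Clew's implementation: the `Result` class, the per-keyword `strengths` dictionary, and the choice to measure completeness as the fraction of keywords matched and strength as a plain sum are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Result:
    url: str
    # Hypothetical per-keyword match strengths for this page.
    # A keyword the page never mentions is simply absent.
    strengths: dict[str, float]


def rank(results: list[Result], keywords: list[str]) -> list[Result]:
    """Sort by completeness of the match first, then by total strength."""

    def sort_key(result: Result) -> tuple[float, float]:
        matched = sum(1 for k in keywords if result.strengths.get(k, 0.0) > 0.0)
        # Fraction of the query's keywords that appear on the page.
        completeness = matched / len(keywords) if keywords else 0.0
        # Combined strength across all keywords in the query.
        strength = sum(result.strengths.get(k, 0.0) for k in keywords)
        return (completeness, strength)

    # Tuples compare element by element, so strength only breaks ties
    # between results that are equally complete.
    return sorted(results, key=sort_key, reverse=True)


pages = [
    Result("example.com/benjamin-spam", {"benjamin": 50.0}),
    Result("example.com/about", {"benjamin": 1.0, "hollon": 1.0}),
]
ranked = rank(pages, ["benjamin", "hollon"])
```

With these made-up numbers, the page that mentions both keywords comes first even though the other page has a far stronger match for “benjamin” alone.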

It sure won’t be easy, but it should drastically improve the results you get when using Clew.

bye

Okay, I need to get back to bashing my head against the brick wall that is the Clew codebase. See you next time!



Benjamin Hollon

Benjamin Hollon is Clew’s creator. When not reinventing the wheel, Benjamin writes for his numerous blogs, crafts stories, plays and composes trombone, travels the world, commits atrocities in the terminal, runs a social media site, codes, studies Communication and Professional Writing at Texas A&M, forgets his family’s birthdays, gets locked into the library (not realizing it’s closed), and generally goofs around.

You can financially support his work, including the development of Clew, on Liberapay.
