Invent


Making MySQL Searches Relevant

One thing that annoys me about some websites is their search functionality and the lack of relevant results it returns. In theory, a search feature is a relatively simple concept: you type a keyword for what you are looking for, and a list of matching results is provided. While this looks pretty simple on the front end, the back end of a search query is a complicated process.

For me, when creating a new search system, it’s always best to start it out pretty simple. A simple database query to find a simple match in a simple database field. Let’s use a recent project I’ve been working on as an example.

Let’s assume our table looks like this.

+-------+------+--------+----------------+-------+---------------+--------------+------------+
| entry | item | system | title          | extra | description   | tags         | date       |
+-------+------+--------+----------------+-------+---------------+--------------+------------+
|   11  |  47  |   2    | Anthonys Video | 23412 | anthony sho.. |anthony,kinson| 1346172183 |
+-------+------+--------+----------------+-------+---------------+--------------+------------+

SELECT *
FROM tag_items
WHERE title='Anthonys Video'

Too simple, right? The reason for this is that I just want to return something, anything at all, so that I can start programming and styling the front end with real data to output. At this point there is no sense in creating complex queries because I’m not absolutely certain that the table or front end will continue to work the way it does now. However, once I have completed my front end and have it all functioning, we knock things up a step, and this next step is usually where most websites leave it.

SELECT *
FROM tag_items
WHERE title LIKE '%Ant%'
ORDER BY date DESC
LIMIT 0,10

So as you can tell from this query, we’re searching the table for items which have a “title” like “Ant” (where Ant is the search term) and ordering them so the most recent entries show first. Some websites give various options for this type of search, such as a drop-down offering to search by title, description or tag words. This is all very well for a lot of sites, as they don’t need anything more expansive than this, especially when the user knows exactly what they are looking for.
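In a real search box the term comes from the user, so it is safer to bind it as a parameter than to paste it into the SQL string. Here is a minimal sketch of that idea in Python. It uses the built-in sqlite3 module as a stand-in for MySQL purely so it runs anywhere, and the helper names like_pattern and search_titles, along with the sample rows, are made up for illustration.

```python
import sqlite3


def like_pattern(term: str) -> str:
    """Wrap a search term in % wildcards, escaping LIKE's own wildcards first."""
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return f"%{escaped}%"


def search_titles(conn, term, limit=10):
    """Return (entry, title) for the newest rows whose title contains the term."""
    sql = (
        "SELECT entry, title FROM tag_items "
        "WHERE title LIKE ? ESCAPE '\\' "
        "ORDER BY date DESC LIMIT ?"
    )
    # The term and the limit are bound as parameters, never concatenated.
    return conn.execute(sql, (like_pattern(term), limit)).fetchall()


# Hypothetical sample data, trimmed to the columns this query needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag_items (entry INTEGER PRIMARY KEY, title TEXT, date INTEGER)")
conn.executemany(
    "INSERT INTO tag_items VALUES (?, ?, ?)",
    [
        (11, "Anthonys Video", 1346172183),
        (12, "Ant Farm", 1346172999),
        (13, "100% Pure", 1346170000),
    ],
)

print(search_titles(conn, "Ant"))  # [(12, 'Ant Farm'), (11, 'Anthonys Video')]
```

Escaping % and _ means a user who types a literal percent sign only matches titles that contain one, rather than matching every row.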

The problem with this is that with today’s internet, search is used a lot as a discovery portal, a way for users to find and discover new things which are of more relevance to what they are looking for. To do this, we need to offer a search that doesn’t just deliver content to a user based on a single factor such as a title.

At Duxter, to tackle this problem I devised a simple relevance and weighting system: we take each possible search field and assign it a value. In this case we’ll use the following, where a higher number is more relevant.

Exact “title” match = 100 points
Partial “title” match = 10 points
Partial “tags” match = 7 points
Partial “extra” match = 5 points
Partial “description” match = 2 points

Let me explain these values. If you search for “Big Foot Spotted” then chances are you know what you’re looking for. If an entry is found with the exact title “Big Foot Spotted”, this will be highly relevant, so we give it an excessively high score so that it’s listed above all other results, which may only have had partial matches. For example, if we had an entry called “Big Spotted Foot”, it is relevant, but probably not what the user was looking for. For this reason we give partial matches of a title 10 points. We then take into account other factors, such as what tags the item has, extra data supplied, and the description.

Tags are usually a good way to find content based on keywords, but tags get reused a lot and are often assigned to things with little relevance, so we consider these less important than the title of the item. Extra, for us, stores extra identifying information, such as video IDs and object keys; we only use this in our search queries so that you can search for items based on a video ID. Last is our lowest priority, the description. Descriptions often contain a lot of words, so there’s more chance of matching something irrelevant.

The way we handle this with an SQL statement is as follows.

-- Each branch tags its matches with a weight; the outer query adds the weights up per entry.
-- UNION ALL keeps every branch's rows without the duplicate check that plain UNION performs.
SELECT *, sum(relevance)
FROM (
	SELECT *, 100 AS relevance FROM tag_items WHERE title='Anthony'
	UNION ALL SELECT *, 10 AS relevance FROM tag_items WHERE title like '%Anthony%'
	UNION ALL SELECT *, 7 AS relevance FROM tag_items WHERE tags like '%Anthony%'
	UNION ALL SELECT *, 5 AS relevance FROM tag_items WHERE extra like '%Anthony%'
	UNION ALL SELECT *, 2 AS relevance FROM tag_items WHERE description like '%Anthony%'
) results
GROUP BY entry
ORDER BY sum(relevance) desc

So here I have created a number of select statements that will grab items where a field matches the search term. Depending on the match that is found a relevance value is given and all the results get grouped together by their entry id. We then sort them by which item has the highest relevance count.

An example of a returned result would be:

+-------+------+--------+----------------+-------+---------------+--------------+----------------+
| entry | item | system | title          | extra | description   | tags         | sum(relevance) |
+-------+------+--------+----------------+-------+---------------+--------------+----------------+
|   11  |  47  |   1    | Anthony        | 23412 | Anthony sho.. |anthony,kinson|      119       |
|   25  |  22  |   4    | Anthonys video | 23876 | This is ant.. |anthony,video,|      19        |
|   83  |  63  |   4    | Last Nights g. | 23747 | Anthony loo.. |anthony,game,l|      9         |
|   22  |  13  |   3    | Seattle even.. | 92834 | Our event in. |seatle,anthony|      7         |
+-------+------+--------+----------------+-------+---------------+--------------+----------------+

So as you can see above, each item was given a combined relevance score; the higher the score, the more relevant the item was to the search query. Searching for Anthony in this case, we found an exact match in system 1 (which is a user). If you searched for Anthony, chances are that the user with the name Anthony is who you are looking for and will be the most relevant result to show you. We then found a video (system 4) which had a partial title match, a tag match and a description match, and so on and so forth.
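If you want to play with the weighting idea, here is a rough, runnable sketch of it in Python. It uses the built-in sqlite3 module in place of MySQL, and the sample rows and the relevance_search helper are invented for illustration. One deliberate difference from the query above: the inner branches only select the entry id and weight, and the totals are joined back to the table, which avoids mixing SELECT * with GROUP BY (something stricter GROUP BY settings can reject). Note that a row whose title matches exactly also passes the partial title test, so it collects both the 100 and the 10.

```python
import sqlite3

# Field weights from the article: (WHERE clause, points).
WEIGHTS = [
    ("title = :term", 100),
    ("title LIKE :pattern", 10),
    ("tags LIKE :pattern", 7),
    ("extra LIKE :pattern", 5),
    ("description LIKE :pattern", 2),
]


def relevance_search(conn, term):
    """Return (entry, title, score) rows, highest score first."""
    branches = " UNION ALL ".join(
        f"SELECT entry, {points} AS relevance FROM tag_items WHERE {clause}"
        for clause, points in WEIGHTS
    )
    sql = f"""
        SELECT t.entry, t.title, SUM(r.relevance) AS score
        FROM ({branches}) AS r
        JOIN tag_items AS t ON t.entry = r.entry
        GROUP BY t.entry, t.title
        ORDER BY score DESC
    """
    params = {"term": term, "pattern": f"%{term}%"}
    return conn.execute(sql, params).fetchall()


# Hypothetical sample rows, not real data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tag_items (entry INTEGER PRIMARY KEY, system INTEGER, "
    "title TEXT, extra TEXT, description TEXT, tags TEXT)"
)
conn.executemany(
    "INSERT INTO tag_items VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, "Anthony", "a1", "Anthony's profile", "anthony,user"),
        (2, 4, "Anthonys video", "b2", "A clip of anthony", "anthony,video"),
        (3, 4, "Last night", "c3", "Anthony looks tired", "anthony,game"),
        (4, 3, "Seattle event", "d4", "Our event", "seattle,anthony"),
        (5, 2, "Unrelated", "e5", "Nothing here", "misc"),
    ],
)

for row in relevance_search(conn, "Anthony"):
    print(row)
```

Keeping the weights in one list makes it easy to tune the scoring without touching the query itself.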

Now, the system I have set up for Duxter is much more complicated than this; I’ve tried to simplify it as much as possible to show how it works. We in fact have a separate table with indexed tag words and keys, and we return the top 5 results from each “system” in our quick AJAX search feature. Things start getting real complicated at this point, so we’ll save that for another day. For now though, I hope this look into making searches a bit more relevant comes in handy for you.
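To give a flavour of the “top 5 per system” idea without the indexed tables, here is a small Python sketch that takes already-scored rows and keeps the best few from each system. This is only an illustration of the grouping step, not how Duxter actually does it; the top_per_system helper and the sample rows are made up.

```python
from collections import defaultdict


def top_per_system(rows, limit=5):
    """Group (system, title, score) rows by system, keeping the best `limit` of each."""
    grouped = defaultdict(list)
    # Walk the rows from highest to lowest score so each bucket fills best-first.
    for system, title, score in sorted(rows, key=lambda r: r[2], reverse=True):
        if len(grouped[system]) < limit:
            grouped[system].append((title, score))
    return dict(grouped)


# Hypothetical scored results: (system, title, score).
scored = [
    (1, "Anthony", 119),
    (4, "Anthonys video", 19),
    (4, "Last night", 9),
    (3, "Seattle event", 7),
]

print(top_per_system(scored, limit=1))
```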

Until next time.

Anthony



The Legends of Brutal Legend

Brutal Legend is one of my all-time favorite games, and as I recently purchased one of those awesome Hauppauge HD PVR 2s, I thought I’d take the opportunity to capture some footage during a new playthrough.

The video below contains the stories from all of the hidden “Legends” throughout the game. I’ve combined them all into one 15-minute story about the history of the Brutal Legend world.



Final Hurt & Final Fight (Final Fantasy VII: Advent Children)

A few years ago I made a couple of little fan videos for the film “Final Fantasy VII: Advent Children”. However, at the time I only owned the standard definition DVD of the movie, and whilst I liked the videos I made, I felt it only right as a massive Final Fantasy fan, and owner of the Advent Children Blu-ray, to update my videos in full high definition.

Before we get to the videos let me start off by saying I do not own the rights to any of the video content or music played in these videos. All audio and video is copyrighted and property of their respective owners.

Final Hurt
This video tells the story of a man who loses the person he loves most in the world and the emotions that entails: going back to a childlike state, feeling helpless and not being able to change things, being alone and cut off from the rest of the world.

Final Fight
This second video is simply an adrenalin-filled, kick-ass journey.



Founder, Co-Founder, Partner, Owner

Something I hear a lot is “What does partner mean? What’s the difference between a partner and a co-founder, and what’s the difference between a co-founder and a founder?”

Quite simply, they all mean the same thing, but different people and companies use them in different ways. Let me explain.

When somebody is a Founder, it usually indicates the company was started by a sole person. An exception to this was at Facebook where both the titles of Founder and Co-Founder were used because Mark Zuckerberg wanted to distinguish himself from all others that claimed/litigated status as founders. This has been seen at other companies when someone is granted founder status later in the formation of a company and hence only considered a Co-Founder vs. a Founder. Another example of this usage is to denote the person who worked on the idea prior to bringing in the other “co-Founders.”

For me in general, if there are multiple founders, all are (or should be) classed as co-founders; if there is only one, then a single founder title should be used. You can refer to all co-founders as founders, but I find the dual-class structure a bit distasteful and disingenuous (bad Mark Zuckerberg). Some companies prefer to use the title “Founder” for all people entitled to the title. One of the reasons for this is that, when asked, a Founder can say the company was founded with a partner or team. This helps avoid potentially uncomfortable questions later on if a partner decides to leave and, in the case where there are only 2 Founders, prevents you having to change the title from Co-Founder to Founder if said partner leaves.

This then leads to the titles “Partner” and “Owner”, which are pretty much the same thing. In my opinion a person can use either Founder, Co-Founder, Partner or Owner in their title if they are an actual Founder, Co-Founder, Partner or Owner. Usually early employees at a company don’t really get founders’ shares of equity, so that is how I distinguish the difference.

I hope that has helped explain my understanding of the titles. This stuff isn’t really my thing, but I have learnt a bit about it over the last year. Let me know your thoughts and how you distinguish the titles. I’d really be interested to hear.



Adventures of Adam & Unicorn

So at Duxter, our CEO Adam owns a Mac, and on that Mac he has key protectors on his keyboard. However, these key protectors are rainbow coloured, and ever since I discovered them I’ve been making fun of Adam in the form of Unicorn jokes.

So last week I decided I should have a Unicorn cuddly toy mailed to him at the office anonymously and have my partner in crime (our CTO Sky) capture the moment on camera. Well, today that moment finally arrived and Sky was able to capture some pretty awesome pictures of Adam unwrapping his new toy. Upon viewing the pictures I thought “hey, this would make a pretty awesome comic strip!” and took it upon myself to open up Photoshop and turn the whole event into a comic strip.

(This image is around 8.8MB and was created at high resolution so we could print it for the office)

Hopefully we’ll be able to snap up some more pictures of Adam & Unicorn around the office and continue to bring you comic book slides.



This Week At Duxter

This week has been a pretty fun week for me; I’ve solved some major problems and created some fantastic things.

Firstly, if you’re a Duxter user, you will know that we have recently been having outbreaks of epic slowness and sometimes the site not loading at all. Initially, when this started happening, we concluded that it had something to do with Node (our websockets-type server) and its communications with our database server. What we did at that point was start pooling the database connections in Node, hoping that it would improve things, and it did, slightly, and for a short time. The improvements we saw were simply false indicators of us finding the solution; in fact, the improvement in performance was simply down to the fact it was late at night (for me) and activity was low on the site.

It soon became obvious that our previous fix added some slight improvements but didn’t solve our overloading issues. It soon became a real problem, as we were having to reboot the Node daemon almost every hour. As all our developers were knee-deep in other projects, rebooting Node became a sort of temporary solution until we had time to look deeper into it… that was until one morning when I was trying to work and was having to restart Node every 5 minutes. Now, as almost all of our team is based in the United States (and me being from the United Kingdom), I was all alone and had no choice but to investigate this problem myself.

After a bit of stress and some Sherlock Holmes style investigation, I finally found the problem. What made this difficult to track was the fact that certain services are hosted on different servers, so even though our production server was showing as healthy and under no stress at all, the site was taking a really long time to load. So after ruling out problems with our production server I decided to look elsewhere. The Node debug logs showed no issues, and no database errors were produced. At this point I started looking at the active connections and CPU usage of our database server. Bingo! There it was: 99.8% CPU usage and an average of about 150 connections at any given time. Unfortunately that was all the information I could obtain from there, so I dived straight into the database and ran the “SHOW PROCESSLIST” query. And right there was the same query running hundreds of times. After a bit more tracking I’d located it as an issue with our Instant Messenger system, specifically the friends list functions which run through Node. Disabling the IM system instantly reduced the database CPU usage to nearly nothing, and the active connections dropped to about 5 on average. At this moment in time the site and servers are all running fantastically well with not so much as a hiccup.

We’ve now located the piece of bad code that caused this, and now have one of our developers making several improvements to the IM system. So, with that all diagnosed we should have a new version of our IM system running shortly, which I can’t wait for.
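Spotting “the same query running hundreds of times” by eye works, but it can also be counted. Below is a small Python sketch that tallies identical statements from rows shaped like the output of MySQL’s SHOW FULL PROCESSLIST (which has Id, User, Host, db, Command, Time, State and Info columns). The busiest_queries helper and the sample rows, including the friends query text, are made up for illustration; fetching the rows from a live server is left out.

```python
from collections import Counter


def busiest_queries(processlist, top=3):
    """Count identical running statements in process-list rows (dicts)."""
    counts = Counter(
        row["Info"]
        for row in processlist
        # Skip idle connections and rows without statement text.
        if row.get("Command") == "Query" and row.get("Info")
    )
    return counts.most_common(top)


# Hypothetical rows; a real list would come from SHOW FULL PROCESSLIST.
rows = [
    {"Id": i, "Command": "Query", "Info": "SELECT * FROM friends WHERE user_id = 1"}
    for i in range(1, 4)
]
rows.append({"Id": 4, "Command": "Sleep", "Info": None})
rows.append({"Id": 5, "Command": "Query", "Info": "SHOW FULL PROCESSLIST"})

for statement, count in busiest_queries(rows):
    print(count, statement)
```

A statement that dominates the tally is a good first suspect when the database CPU is pegged but the web server looks healthy.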

So, what have I been working on this week? Well, for me, one of our most exciting features: our Developer Platform. This is the place where third parties can create and publish apps internally on Duxter for Duxter users to use, externally on their own websites, and in native applications for the likes of Windows, Mac OS X and mobile devices. I won’t go into too much detail about how this is all going to work for end-users and developers at the moment, as we’ll be doing a public announcement when we’re ready to launch it. This for me, however, has been one of my favorite projects for the simple fact that creating frameworks and laying down the foundations for such a system is a real challenge. I thrive on achieving the impossible, tackling difficult challenges and pushing my abilities to the absolute max. It’s a great change from what I’ve been working on lately, such as creating “features” for Duxter; whilst I do enjoy that stuff, I find it a bit trivial. I mean, within a few minutes of being proposed an idea or feature, I can see exactly how it is laid out, how it should work and what needs to be done. It just becomes a case of writing the code and tackling any bugs that show up; a perfect example of this is our “Rewards Store” (which I’ll get to in a moment). Creating a system from scratch though, now that takes a bit more thought, and with every new class created there’s that awesome sense of accomplishment.

So, the Rewards Store. You may have noticed we finally launched our Rewards Store on Duxter, and this system has been a real pain for us. Initially, when we laid down the plans for how things should work on Duxter, we decided that we should use a pre-built eCommerce system to save us time and allow developers to spend more time on other features. This would require a few things. To cut the story short: using an SSO (single sign-on) solution caused slight overhead, we were having to use APIs in order to use user information, and the eCommerce system required a bunch of modding to get it to do what we needed. Seriously, as great as this system was stand-alone, integration was an absolute nightmare. We dropped it last week and said farewell to it for good. After a short meeting with Adam we had a list of primary features our Rewards Store should have. We laid down a plan, pulled in Robert and Ross (2 of our developers) to start working on certain features, and within 2 days we had a fully functional eCommerce system ready to go live. Whilst it’s still in its early stages and we still have a bunch of things to add, completing this within 2 days was a real achievement and really showed off each of our developers’ talents. In fact, it took longer for us to populate the store with products than it did to create the system.

On the business side of things, our CEO Adam has been real busy this week (seriously, the guy is never off the phone). We’ve acquired a second office in Seattle, which will give our CTO Sky a new battleground for his Nerf gun antics (the guy is obsessed with Nerf guns). Adam has also been real busy meeting new start-ups and creating relationships with these companies, and existing companies, but as geeky as I may be, all this business stuff is way over my head, so check out his blog if you’re more interested in the business side of things. He doesn’t get much time to blog about this sort of stuff, but he does try.

So that’s about all I have to share with you this week. There’s been so much happening in all aspects of Duxter that my brain is just overloaded with things to tell you about. If you’re not already a member on Duxter, get yourself involved at http://duxter.com. Seriously, if you’re a gamer, you NEED to be here.

Until next week peeps.

~AK



A Message to Movie & Record Companies

First of all let me start off by saying, I appreciate and respect Copyright laws, and the reasons for them. I myself as a digital content provider owning copyrighted material understand the importance and reasons for copyright protection.

Which brings me to the whole purpose of this article.

Check out some of my Blu-rays. I think it’s safe to assume I’m a big movie fan. The number of DVDs I own is probably 5 times the size of my Blu-ray collection. Unfortunately, I doubt I’ll be buying any more. Why? Let’s start off with my experience with the likes of YouTube and Facebook over the last couple of years.

So, I’m out at a fireworks display, I record it on my phone to show my family who couldn’t make it, and upload it to YouTube. Within 5 minutes my video is taken down, and I receive warnings. Do you know why it was taken down? Because somewhere in the distant background a song was playing, and I didn’t own the copyright to that song!

This has happened MANY times, my videos being blocked and my account risking deletion / suspension, because a part of some song is playing in the background!

Today I create a little fan video using content from a Blu-Ray I purchased. Again, blocked on 3 copyright notices within 2 minutes of it being uploaded.

Like I say, I understand the reasoning behind copyright, but this is just plain pathetic! Sure, piracy is an issue, but the whole movie and music industry has turned into an absolute disgrace, using excuses like “we’re losing so much money due to piracy”. OK, first of all, if the actors were not paid so damn much, maybe your profits would be higher. Seriously, people being paid millions for prancing around in front of a camera, and you’re complaining about profits, trying to force governments to pass unethical laws. What the hell has the world come to?!

You say each pirated movie and album download costs you a sale. Well, your unethical tactics, greed and absolutely ridiculous behavior have cost you more in lost sales from me than you would have lost from 100 illegal downloaders who wouldn’t have even bought what they downloaded in the first place. At this moment in time, I’d rather give my money to a local unemployed guy selling bootleg DVDs on a street corner than purchase from a store.

That said, I won’t be purchasing movies or music from certain companies anymore. I have Spotify and Netflix, I can get my entertainment fix there.

To those actors, artists and companies out there who do appreciate their fans and don’t act like greedy money grabbing fuck nuts, thumbs up and a smiley face to ya, much respect. The rest of you, well, I’m sure I can find a nice post for you to go swivel on.



Removing Babylon From Firefox

First of all, if you’re reading this, chances are you have been invaded by the scum which is Babylon Search / Toolbar, so you will completely understand the frustration this piece of crap has caused me.

After searching for hours for information on how to remove this, I found no solution which worked. The horrible little thing kept returning, mainly in the form of the address bar keyword search.

So, here’s a quick guide on how to remove it.

Firstly, I’m going to assume you have done all the obvious things: uninstalled any programs related to Babylon, removed the search plugins from Firefox and so on. So let’s move on from there.

Step 1:
Navigate to…
(64bit) C:/Program Files (x86)/Mozilla Firefox/searchplugins/
(32bit) C:/Program Files/Mozilla Firefox/searchplugins/

and delete the file babylon.xml and any other references to Babylon you may find.

Step 2:
Open Firefox, and in your address bar type “about:config” (without the quotes).
Click to proceed past the warning and, in the filter bar, type in babylon.

You should see a bunch of entries as shown in the screenshot below. Right-click each one of these and choose “Reset”.

You’re done.

The horrible piece of crap that is Babylon should no longer cause you any problems. If there are other places that this thing hides, leave a comment and let me know and I’ll update this post. Time to rid the internet of Babylon.
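If you would rather not hunt through the searchplugins folder by hand, the file check in Step 1 can be sketched in a few lines of Python. This only lists files whose name contains “babylon”; it deletes nothing. The find_babylon_files helper is made up for illustration, and the demo runs against a temporary folder with fake files rather than a real Firefox install.

```python
import tempfile
from pathlib import Path


def find_babylon_files(directory):
    """Return files under `directory` whose name contains 'babylon' (any case)."""
    root = Path(directory)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and "babylon" in p.name.lower()
    )


# Demo against a throwaway folder standing in for the searchplugins directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("babylon.xml", "google.xml", "Babylon-extra.xml"):
        (Path(tmp) / name).write_text("<xml/>")
    found_names = [p.name for p in find_babylon_files(tmp)]

print(found_names)
```

To use it for real you would pass one of the searchplugins paths from Step 1 instead of the temporary folder, then review the list before deleting anything.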



Battlefield 3: Paris Multiplayer Gameplay

Let me first start off by saying that Battlefield 2: Modern Combat was one of the first online games I’d ever played, after Hardware: Online Arena. This was back in the day when I had the slimline PlayStation 2 and getting online was a major hassle. However, Battlefield 2: Modern Combat was the game that really got me into first-person shooters; in fact, the day I bought my Xbox 360, I also purchased Battlefield 2: Modern Combat…



Dead Rising 2: Pleasantly Surprised

So, months after release, I finally decided to buy Dead Rising 2 due to boredom. After already owning and enjoying the first installment of Dead Rising, I thought I may as well give it a go; plus, I picked up a copy for £17.00, so you can’t complain about that…

