SQLite Forum

[help] need to increase efficiency and obtain the maximum value from multiple selects
Login

[help] need to increase efficiency and obtain the maximum value from multiple selects

(1) By Adam (astranberg) on 2021-11-08 23:32:30 [link] [source]

Hey all,

First of all, I really appreciate everybody's help. I'm new to SQLite and I'm sure there's a more efficient way to do what I'm wanting. I tried executing the first example below, but after four hours it still hadn't finished. And it doesn't even do half of what I want!

UPDATE players
SET max_trophies =
	(SELECT winner_trophies
	 FROM battles
	 WHERE winner_tag == players.player_tag
	 ORDER BY winner_trophies
	 LIMIT 1)

UPDATE players
SET max_trophies =
	(SELECT loser_trophies
	 FROM battles
	 WHERE loser_tag == players.player_tag
	 ORDER BY loser_trophies
	 LIMIT 1)


Here are the tables for reference:
CREATE TABLE players(
        player_tag TEXT,
        update_date TEXT,
        max_trophies TEXT,
        UNIQUE(player_tag))

CREATE TABLE battles(
        battle_time TEXT,
        battle_type TEXT,
        winner_tag TEXT,
        winner_trophies INTEGER,
        winner_decktype TEXT,
        winner_cards TEXT,
        loser_tag TEXT,
        loser_trophies INTEGER,
        loser_decktype TEXT,
        loser_cards TEXT,
        UNIQUE(battle_time,winner_tag,loser_tag))

My goal is to update the max_trophies in the player table to the highest trophies they have been attributed in a list of numerous battles. This means to select winner_trophies where winner_tag is the player_tag AND to  select loser_trophies where loser_tag is the player_tag, then set the winner_trophies to the maximum value. One step further would be to only do this if the maximum value is greater than the current max_trophies.

I'm reading into the MAX and HAVING clauses and will update if I have other ideas.

Thanks!

(2) By Adam (astranberg) on 2021-11-08 23:50:44 in reply to 1 [source]

I think the use of max and group by will help here:

UPDATE players
SET max_trophies = 
	(SELECT
		max(winner_trophies)
	FROM
		battles
	WHERE
		winner_tag == players.player_tag
	AND
		winner_trophies > max_trophies
	GROUP BY
		winner_tag)


UPDATE players
SET max_trophies = 
	(SELECT
		max(loser_trophies)
	FROM
		battles
	WHERE
		loser_tag == players.player_tag
	AND
		loser_trophies> max_trophies
	GROUP BY
		loser_tag)

Is there a more efficient way?

(3) By ddevienne on 2021-11-09 07:04:56 in reply to 1 [link] [source]

It's not your queries the main problem, it's your indexes.
You are joining on winners and losers, which are not indexed in a usable way.

Just switching around the battles' UNIQUE index to put winner_tag or loser_tag first,
will speed up one of your two queries. The other should be indexed separately (non-unique).
Or keep your UNIQUE index as-is, and index both battles' winners and losers non-unique separately (i.e. 2 indexes).

From the queries you are showing, seems like these two indexes would help most:

CREATE INDEX battles_winners on battles (winner_tag, winner_trophies);
CREATE INDEX battles_losers on battles (loser_tag, loser_trophies);

That way, joining will be fast (tags are first), and the index contains
the desired trophies in order, so the ORDERY BY LIMIT 1 (or just MAX())
should be fast, and no need to access the battles table at all.

Just read up on explain query plan to compare your queries' plans with various indexes.
See also the CLI's .expert command.