Data Stream Median

Numbers keep coming, return the median of numbers at every time a new number added.

Clarification

What's the definition of Median?

Median is the number that in the middle of a sorted array. If there are n numbers in a sorted array A, the median is A[(n - 1) / 2]. For example, if A=[1,2,3], median is 2. If A=[1,19], median is 1.

Example

For numbers coming list: [1, 2, 3, 4, 5], return [1, 1, 2, 2, 3].

For numbers coming list: [4, 5, 1, 3, 2, 6, 0], return [4, 4, 4, 3, 3, 3, 3].

For numbers coming list: [2, 20, 100], return [2, 2, 20].

Solution:

Python 默认是小顶堆的,所以自动排成1 2 3 4 5 6 peek 为第一个元素

可以 乘 -1 放入队列, 这样小顶堆自动变成了大顶堆, 取出的时候 * -1

大小 顶堆的思路,尽量让大顶堆最多只比小顶堆多一个元素,这样中值无论奇偶都是当前的大顶堆的peek()

12345 -> 6789 median = 5

1234 -> 6789 median = 4

这道题给我们一个数据流,让我们找出中位数,由于数据流中的数据并不是有序的,所以我们首先应该想个方法让其有序。如果我们用vector来保存数据流的话,每进来一个新数据都要给数组排序,很不高效。所以之后想到用multiset这个数据结构,是有序保存数据的,但是它不能用下标直接访问元素,找中位数也不高效。这里用到的解法十分巧妙,我们使用大小堆来解决问题,其中大堆保存右半段较大的数字,小堆保存左半段较小的数组。这样整个数组就被中间分为两段了,由于堆的保存方式是由大到小,我们希望大堆里面的数据是从小到大,这样取第一个来计算中位数方便。我们用到一个小技巧,就是存到大堆里的数先取反再存,这样由大到小存下来的顺序就是实际上我们想要的从小到大的顺序。当大堆和小堆中的数字一样多时,我们取出大堆小堆的首元素求平均值,当小堆元素多时,取小堆首元素为中位数

import heapq
class Solution:
    """
    @param nums: A list of integers.
    @return: The median of numbers
    """
    def medianII(self, nums):
        if nums == None or len(nums) == 0:
            return 0
        # write your code here
        maxheap, minheap = [], []
        res = [0] * len(nums)

        heapq.heappush(maxheap, nums[0] * -1)
        res[0] = nums[0]

        for i in range(1, len(nums)):
            x = nums[i]
            if nums[i] <= maxheap[0] * -1 :
                heapq.heappush(maxheap, nums[i] * -1)
            else:
                heapq.heappush(minheap, nums[i])

            if len(maxheap) - len(minheap) > 1:
                heapq.heappush(minheap, heapq.heappop(maxheap) * -1 )
            elif len(maxheap) < len(minheap):
                heapq.heappush(maxheap, heapq.heappop(minheap) * -1 )

            res[i] = maxheap[0] * -1
        return res

Last updated

Was this helpful?