> For the complete documentation index, see [llms.txt](https://liuxue2010.gitbook.io/data-structure-and-algorithms/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://liuxue2010.gitbook.io/data-structure-and-algorithms/data-structure/data-stream-median.md).

# Data Stream Median

Numbers keep coming, return the median of numbers at every time a new number added.

## Clarification

What's the definition of Median?

Median is the number that in the middle of a sorted array. If there are n numbers in a sorted array A, the median is A\[(n - 1) / 2]. For example, if A=\[1,2,3], median is 2. If A=\[1,19], median is 1.

## Example

For numbers coming list: \[1, 2, 3, 4, 5], return \[1, 1, 2, 2, 3].

For numbers coming list: \[4, 5, 1, 3, 2, 6, 0], return \[4, 4, 4, 3, 3, 3, 3].

For numbers coming list: \[2, 20, 100], return \[2, 2, 20].

## Solution:

Python 默认是小顶堆的，所以自动排成1 2 3 4 5 6 peek 为第一个元素

可以 乘 -1 放入队列， 这样小顶堆自动变成了大顶堆， 取出的时候 \* -1

大小 顶堆的思路，尽量让大顶堆最多只比小顶堆多一个元素，这样中值无论奇偶都是当前的大顶堆的peek()

12345 -> 6789 median = 5

1234 -> 6789 median = 4

这道题给我们一个数据流，让我们找出中位数，由于数据流中的数据并不是有序的，所以我们首先应该想个方法让其有序。如果我们用vector来保存数据流的话，每进来一个新数据都要给数组排序，很不高效。所以之后想到用multiset这个数据结构，是有序保存数据的，但是它不能用下标直接访问元素，找中位数也不高效。这里用到的解法十分巧妙，我们使用大小堆来解决问题，其中大堆保存右半段较大的数字，小堆保存左半段较小的数组。这样整个数组就被中间分为两段了，由于堆的保存方式是由大到小，我们希望大堆里面的数据是从小到大，这样取第一个来计算中位数方便。我们用到一个小技巧，就是存到大堆里的数先取反再存，这样由大到小存下来的顺序就是实际上我们想要的从小到大的顺序。当大堆和小堆中的数字一样多时，我们取出大堆小堆的首元素求平均值，当小堆元素多时，取小堆首元素为中位数

```
import heapq
class Solution:
    """
    @param nums: A list of integers.
    @return: The median of numbers
    """
    def medianII(self, nums):
        if nums == None or len(nums) == 0:
            return 0
        # write your code here
        maxheap, minheap = [], []
        res = [0] * len(nums)

        heapq.heappush(maxheap, nums[0] * -1)
        res[0] = nums[0]

        for i in range(1, len(nums)):
            x = nums[i]
            if nums[i] <= maxheap[0] * -1 :
                heapq.heappush(maxheap, nums[i] * -1)
            else:
                heapq.heappush(minheap, nums[i])

            if len(maxheap) - len(minheap) > 1:
                heapq.heappush(minheap, heapq.heappop(maxheap) * -1 )
            elif len(maxheap) < len(minheap):
                heapq.heappush(maxheap, heapq.heappop(minheap) * -1 )

            res[i] = maxheap[0] * -1
        return res
```


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://liuxue2010.gitbook.io/data-structure-and-algorithms/data-structure/data-stream-median.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
