@NotThreadSafe public abstract class SortBuffer extends Object implements DataBuffer
DataBuffer
implementation which sorts all appended records only by subpartition index.
Records of the same subpartition keep the appended order.
It maintains a list of MemorySegment
s as a joint buffer. Data will be appended to the
joint buffer sequentially. When writing a record, an index entry will be appended first. An index
entry consists of 4 fields: 4 bytes for record length, 4 bytes for Buffer.DataType
and 8
bytes for address pointing to the next index entry of the same subpartition which will be used to
index the next record to read when coping data from this DataBuffer
. For simplicity, no
index entry can span multiple segments. The corresponding record data is seated right after its
index entry and different from the index entry, records have variable length thus may span
multiple segments.
Modifier and Type | Field and Description |
---|---|
protected BufferRecycler |
bufferRecycler
BufferRecycler used to recycle freeSegments . |
protected LinkedList<MemorySegment> |
freeSegments
A list of
MemorySegment s used to store data in memory. |
protected static int |
INDEX_ENTRY_SIZE
Size of an index entry: 4 bytes for record length, 4 bytes for data type and 8 bytes for
pointer to next entry.
|
protected boolean |
isFinished
Whether this sort buffer is finished.
|
protected boolean |
isReleased
Whether this sort buffer is released.
|
protected long[] |
lastIndexEntryAddresses
Addresses of the last record's index entry for each subpartition.
|
protected long |
numTotalBytesRead
Total number of bytes already read from this sort buffer.
|
protected long |
readIndexEntryAddress
Index entry address of the current record or event to be read.
|
protected int |
readOrderIndex
Used to index the current available subpartition to read data from.
|
protected int |
recordRemainingBytes
Record bytes remaining after last copy, which must be read first in next copy.
|
ArrayList<MemorySegment> |
segments
A segment list as a joint buffer which stores all records and index entries.
|
protected int[] |
subpartitionReadOrder
Data of different subpartitions in this sort buffer will be read in this order.
|
Modifier | Constructor and Description |
---|---|
protected |
SortBuffer(LinkedList<MemorySegment> freeSegments,
BufferRecycler bufferRecycler,
int numSubpartitions,
int bufferSize,
int numGuaranteedBuffers,
int[] customReadOrder) |
Modifier and Type | Method and Description |
---|---|
boolean |
append(ByteBuffer source,
int targetSubpartition,
Buffer.DataType dataType)
No partial record will be written to this
SortBasedDataBuffer , which means that
either all data of target record will be written or nothing will be written. |
protected int |
copyRecordOrEvent(MemorySegment targetSegment,
int targetSegmentOffset,
int sourceSegmentIndex,
int sourceSegmentOffset,
int recordLength) |
void |
finish()
Finishes this
DataBuffer which means no record can be appended anymore. |
protected int |
getSegmentIndexFromPointer(long value) |
protected int |
getSegmentOffsetFromPointer(long value) |
boolean |
hasRemaining()
Returns true if not all data appended to this
DataBuffer is consumed. |
boolean |
isFinished()
Whether this
DataBuffer is finished or not. |
boolean |
isReleased()
Whether this
DataBuffer is released or not. |
long |
numTotalBytes()
Returns the total number of bytes written to this
DataBuffer . |
long |
numTotalRecords()
Returns the total number of records written to this
DataBuffer . |
void |
release()
Releases this
DataBuffer which releases all resources. |
boolean |
returnFreeSegments(int numFreeSegments)
Try to release some unused memory segments.
|
protected void |
updateReadSubpartitionAndIndexEntryAddress() |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
getNextBuffer
protected static final int INDEX_ENTRY_SIZE
protected final LinkedList<MemorySegment> freeSegments
MemorySegment
s used to store data in memory.protected final BufferRecycler bufferRecycler
BufferRecycler
used to recycle freeSegments
.public final ArrayList<MemorySegment> segments
protected final long[] lastIndexEntryAddresses
protected long numTotalBytesRead
protected boolean isFinished
protected boolean isReleased
protected final int[] subpartitionReadOrder
protected long readIndexEntryAddress
protected int recordRemainingBytes
protected int readOrderIndex
protected SortBuffer(LinkedList<MemorySegment> freeSegments, BufferRecycler bufferRecycler, int numSubpartitions, int bufferSize, int numGuaranteedBuffers, @Nullable int[] customReadOrder)
public boolean append(ByteBuffer source, int targetSubpartition, Buffer.DataType dataType) throws IOException
SortBasedDataBuffer
, which means that
either all data of target record will be written or nothing will be written.append
in interface DataBuffer
IOException
public boolean returnFreeSegments(int numFreeSegments)
Note that this class is not thread safe, so please make sure to call append(ByteBuffer source, int targetSubpartition, Buffer.DataType dataType)
and this method
with lock acquired.
numFreeSegments
- the number of segments to be released.protected int copyRecordOrEvent(MemorySegment targetSegment, int targetSegmentOffset, int sourceSegmentIndex, int sourceSegmentOffset, int recordLength)
protected void updateReadSubpartitionAndIndexEntryAddress()
protected int getSegmentIndexFromPointer(long value)
protected int getSegmentOffsetFromPointer(long value)
public long numTotalRecords()
DataBuffer
DataBuffer
.numTotalRecords
in interface DataBuffer
public long numTotalBytes()
DataBuffer
DataBuffer
.numTotalBytes
in interface DataBuffer
public boolean hasRemaining()
DataBuffer
DataBuffer
is consumed.hasRemaining
in interface DataBuffer
public void finish()
DataBuffer
DataBuffer
which means no record can be appended anymore.finish
in interface DataBuffer
public boolean isFinished()
DataBuffer
DataBuffer
is finished or not.isFinished
in interface DataBuffer
public void release()
DataBuffer
DataBuffer
which releases all resources.release
in interface DataBuffer
public boolean isReleased()
DataBuffer
DataBuffer
is released or not.isReleased
in interface DataBuffer
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.