<span style="float:right;">[[source]](https://github.com/keras-team/keras/blob/master/keras/layers/embeddings.py#L16)</span>
### Embedding
```python
keras.layers.Embedding(input_dim, output_dim, embeddings_initializer='uniform', embeddings_regularizer=None, activity_regularizer=None, embeddings_constraint=None, mask_zero=False, input_length=None)
```
Turns positive integers (indexes) into dense vectors of fixed size.
e.g. `[[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]]`
This layer can only be used as the first layer in a model.
__Example__
```python
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding

model = Sequential()
model.add(Embedding(1000, 64, input_length=10))
# The model will take as input an integer matrix of size (batch, input_length).
# The largest integer (i.e. word index) in the input should be
# no larger than 999 (vocabulary size).
# Now model.output_shape == (None, 10, 64), where None is the batch dimension.
input_array = np.random.randint(1000, size=(32, 10))
model.compile('rmsprop', 'mse')
output_array = model.predict(input_array)
assert output_array.shape == (32, 10, 64)
```
__Arguments__
- __input_dim__: int > 0. Size of the vocabulary,
i.e. maximum integer index + 1.
- __output_dim__: int >= 0. Dimension of the dense embedding.
- __embeddings_initializer__: Initializer for the `embeddings` matrix
(see [initializers](../initializers.md)).
- __embeddings_regularizer__: Regularizer function applied to
the `embeddings` matrix
(see [regularizer](../regularizers.md)).
- __activity_regularizer__: Regularizer function applied to
the output of the layer (its "activation")
(see [regularizer](../regularizers.md)).
- __embeddings_constraint__: Constraint function applied to
the `embeddings` matrix
(see [constraints](../constraints.md)).
- __mask_zero__: Whether or not the input value 0 is a special "padding"
value that should be masked out.
This is useful when using [recurrent layers](recurrent.md),
which may take variable-length input.
If this is `True`, then all subsequent layers
in the model need to support masking or an exception will be raised.
As a consequence, if `mask_zero` is set to `True`, index 0 cannot be
used in the vocabulary (`input_dim` should equal the size of the
vocabulary + 1); see the first sketch after this list.
- __input_length__: Length of input sequences, when it is constant.
This argument is required if you are going to connect
`Flatten` then `Dense` layers downstream
(without it, the shape of the dense outputs cannot be computed);
see the second sketch below.
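
The following is a minimal sketch of the `mask_zero` behaviour described above (the vocabulary size, layer widths, and sequence length are arbitrary choices for illustration): index 0 is reserved for padding, so `input_dim` is the vocabulary size plus one, and the downstream `LSTM` supports masking, so padded timesteps are skipped.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, LSTM

vocab_size = 999  # real tokens use indices 1..999; 0 is the padding value

model = Sequential()
# input_dim = vocab_size + 1 because index 0 is reserved for padding.
model.add(Embedding(input_dim=vocab_size + 1, output_dim=32,
                    mask_zero=True, input_length=10))
model.add(LSTM(16))  # LSTM supports masking, so padded steps are ignored
model.compile('rmsprop', 'mse')

# Two sequences padded with trailing zeros to a common length of 10.
padded = np.array([[5, 8, 42, 0, 0, 0, 0, 0, 0, 0],
                   [7, 1, 2, 3, 4, 5, 0, 0, 0, 0]])
assert model.predict(padded).shape == (2, 16)
```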
__Input shape__
2D tensor with shape: `(batch_size, sequence_length)`.
__Output shape__
3D tensor with shape: `(batch_size, sequence_length, output_dim)`.
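
As a second sketch, this illustrates why `input_length` is needed when `Flatten` and `Dense` follow the embedding (sizes here are again arbitrary): without a fixed sequence length, `Flatten` could not compute the size of its output, and the weight shape of the `Dense` layer would be undefined.

```python
from keras.models import Sequential
from keras.layers import Embedding, Flatten, Dense

model = Sequential()
# input_length=10 fixes the sequence length, so the flattened
# size is known when the model is built: 10 * 64 = 640.
model.add(Embedding(1000, 64, input_length=10))
model.add(Flatten())                       # (None, 10, 64) -> (None, 640)
model.add(Dense(1, activation='sigmoid'))  # weight matrix is (640, 1)
model.compile('rmsprop', 'binary_crossentropy')
assert model.output_shape == (None, 1)
```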