The Arabic Parallel Gender Corpus (APGC) is designed to
support research on gender bias and personalization in natural language processing
applications working on Arabic. The corpus comes in three versions: v1.0, v2.0, and v2.1.
APGC v1.0 includes only first-person sentences and was presented in the 2019 paper "Automatic Gender Identification and Reinflection in Arabic"
by Habash et al. at the First Workshop on Gender Bias in Natural
Language Processing.
APGC v2.0 expands on v1.0 by adding second-person targets
and by increasing the total number of sentences by more than 6.5 times,
reaching over 590K words. APGC v2.0 was introduced in the 2022 paper "The Arabic Parallel Gender Corpus 2.0: Extensions and Analyses" by Alhafni et al.
APGC v2.1 extends the word-level annotations in v2.0 by marking the genders of both the base words and their pronominal enclitics.
APGC v2.1 was introduced in the 2022 paper "User-Centric Gender Rewriting" by Alhafni et al.
The Arabic Parallel Gender Corpus was developed at the Computational Approaches to Modeling Language (CAMeL) Lab at New York University Abu Dhabi.
By downloading the Arabic Parallel Gender Corpus v1.0 files from HERE (2.5 MB), you agree to the terms of the license below.
//////////////////////////////////////////////////////////////////////////////
// License for The Arabic Parallel Gender Corpus v1.0
//////////////////////////////////////////////////////////////////////////////
Copyright 2019 New York University Abu Dhabi. All Rights Reserved. A license to
use and copy this software, data and its documentation solely for your
internal research and evaluation purposes, without fee and without a
signed licensing agreement, is hereby granted upon your download of
the software, through which you agree to the following: 1) the above
copyright notice, this paragraph and the following three paragraphs
will prominently appear in all internal copies and modifications; 2)
no rights to sublicense or further distribute this software are
granted; 3) no rights to modify this software are granted; and 4) no
rights to assign this license are granted. Please Contact the Office
of Industrial Liaison, New York University, One Park Avenue, 6th
Floor, New York, NY 10016 (212) 263-8178, for commercial licensing
opportunities, or for further distribution, modification or license
rights.
Created by Nizar Habash and Christine Chung at the Computational
Approaches to Modeling Language (CAMeL) Lab in New York University
Abu Dhabi.
IN NO EVENT SHALL NYU, OR ITS EMPLOYEES, OFFICERS, AGENTS OR TRUSTEES
("COLLECTIVELY "NYU PARTIES") BE LIABLE TO ANY PARTY FOR DIRECT,
INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY KIND ,
INCLUDING LOST PROFITS, ARISING OUT OF ANY CLAIM RESULTING FROM YOUR
USE OF THIS SOFTWARE, DATA AND ITS DOCUMENTATION, EVEN IF ANY OF NYU
PARTIES HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH CLAIM OR DAMAGE.
NYU SPECIFICALLY DISCLAIMS ANY WARRANTIES OF ANY KIND REGARDING THE
SOFTWARE and DATA, INCLUDING, BUT NOT LIMITED TO, NON-INFRINGEMENT,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE, OR THE ACCURACY OR USEFULNESS, OR COMPLETENESS OF THE
SOFTWARE. THE SOFTWARE AND ACCOMPANYING DOCUMENTATION, IF ANY,
PROVIDED HEREUNDER IS PROVIDED COMPLETELY "AS IS". REGENTS HAS NO
OBLIGATION TO PROVIDE FURTHER DOCUMENTATION, MAINTENANCE, SUPPORT,
UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
Please cite Habash et al. (2019) if you use The Arabic Parallel Gender
Corpus in your research:
Habash, Nizar, Houda Bouamor, Christine Chung. 2019. Automatic Gender
Identification and Reinflection in Arabic. In Proceedings of the First
Workshop on Gender Bias in Natural Language Processing, Florence, Italy.
//////////////////////////////////////////////////////////////////////////////
By downloading the Arabic Parallel Gender Corpus v2.0 files from HERE (38 MB), you agree to the terms of the license below.
//////////////////////////////////////////////////////////////////////////////
// License for The Arabic Parallel Gender Corpus v2.0
//////////////////////////////////////////////////////////////////////////////
Copyright 2022 New York University Abu Dhabi. All Rights Reserved. A license to
use and copy this software, data and its documentation solely for your
internal research and evaluation purposes, without fee and without a
signed licensing agreement, is hereby granted upon your download of
the software, through which you agree to the following: 1) the above
copyright notice, this paragraph and the following three paragraphs
will prominently appear in all internal copies and modifications; 2)
no rights to sublicense or further distribute this software are
granted; 3) no rights to modify this software are granted; and 4) no
rights to assign this license are granted. Please Contact the Office
of Industrial Liaison, New York University, One Park Avenue, 6th
Floor, New York, NY 10016 (212) 263-8178, for commercial licensing
opportunities, or for further distribution, modification or license
rights.
Created by Bashar Alhafni, Nizar Habash, and Houda Bouamor at
the Computational Approaches to Modeling Language (CAMeL) Lab in
New York University Abu Dhabi.
IN NO EVENT SHALL NYU, OR ITS EMPLOYEES, OFFICERS, AGENTS OR TRUSTEES
("COLLECTIVELY "NYU PARTIES") BE LIABLE TO ANY PARTY FOR DIRECT,
INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY KIND ,
INCLUDING LOST PROFITS, ARISING OUT OF ANY CLAIM RESULTING FROM YOUR
USE OF THIS SOFTWARE, DATA AND ITS DOCUMENTATION, EVEN IF ANY OF NYU
PARTIES HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH CLAIM OR DAMAGE.
NYU SPECIFICALLY DISCLAIMS ANY WARRANTIES OF ANY KIND REGARDING THE
SOFTWARE and DATA, INCLUDING, BUT NOT LIMITED TO, NON-INFRINGEMENT,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE, OR THE ACCURACY OR USEFULNESS, OR COMPLETENESS OF THE
SOFTWARE. THE SOFTWARE AND ACCOMPANYING DOCUMENTATION, IF ANY,
PROVIDED HEREUNDER IS PROVIDED COMPLETELY "AS IS". REGENTS HAS NO
OBLIGATION TO PROVIDE FURTHER DOCUMENTATION, MAINTENANCE, SUPPORT,
UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
Please cite Alhafni et al. (2022) if you use The Arabic Parallel Gender
Corpus v2.0 in your research:
Alhafni, Bashar, Nizar Habash, Houda Bouamor. 2022. The Arabic Parallel Gender
Corpus 2.0: Extensions and analyses. In Proceedings of the 13th Language Resources
and Evaluation Conference (LREC), Marseille, France.
//////////////////////////////////////////////////////////////////////////////
By downloading the Arabic Parallel Gender Corpus v2.1 files from HERE (215 MB), you agree to the terms of the license below.
//////////////////////////////////////////////////////////////////////////////
// License for The Arabic Parallel Gender Corpus v2.1
//////////////////////////////////////////////////////////////////////////////
Copyright 2022 New York University Abu Dhabi. All Rights Reserved. A license to
use and copy this software, data and its documentation solely for your
internal research and evaluation purposes, without fee and without a
signed licensing agreement, is hereby granted upon your download of
the software, through which you agree to the following: 1) the above
copyright notice, this paragraph and the following three paragraphs
will prominently appear in all internal copies and modifications; 2)
no rights to sublicense or further distribute this software are
granted; 3) no rights to modify this software are granted; and 4) no
rights to assign this license are granted. Please Contact the Office
of Industrial Liaison, New York University, One Park Avenue, 6th
Floor, New York, NY 10016 (212) 263-8178, for commercial licensing
opportunities, or for further distribution, modification or license
rights.
Created by Bashar Alhafni, Nizar Habash, and Houda Bouamor at
the Computational Approaches to Modeling Language (CAMeL) Lab in
New York University Abu Dhabi.
IN NO EVENT SHALL NYU, OR ITS EMPLOYEES, OFFICERS, AGENTS OR TRUSTEES
("COLLECTIVELY "NYU PARTIES") BE LIABLE TO ANY PARTY FOR DIRECT,
INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY KIND ,
INCLUDING LOST PROFITS, ARISING OUT OF ANY CLAIM RESULTING FROM YOUR
USE OF THIS SOFTWARE, DATA AND ITS DOCUMENTATION, EVEN IF ANY OF NYU
PARTIES HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH CLAIM OR DAMAGE.
NYU SPECIFICALLY DISCLAIMS ANY WARRANTIES OF ANY KIND REGARDING THE
SOFTWARE and DATA, INCLUDING, BUT NOT LIMITED TO, NON-INFRINGEMENT,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE, OR THE ACCURACY OR USEFULNESS, OR COMPLETENESS OF THE
SOFTWARE. THE SOFTWARE AND ACCOMPANYING DOCUMENTATION, IF ANY,
PROVIDED HEREUNDER IS PROVIDED COMPLETELY "AS IS". REGENTS HAS NO
OBLIGATION TO PROVIDE FURTHER DOCUMENTATION, MAINTENANCE, SUPPORT,
UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
Please cite Alhafni et al. (2022) if you use The Arabic Parallel Gender
Corpus v2.1 in your research:
Alhafni, Bashar, Nizar Habash, Houda Bouamor. 2022. User-Centric Gender
Rewriting. In Proceedings of the 2022 Conference of the North American Chapter of
the Association for Computational Linguistics, Seattle, Washington.
//////////////////////////////////////////////////////////////////////////////